Shortcut learning in medical AI hinders generalization: method for estimating AI model generalization without external data.

Authors :
Ong Ly, Cathy
Unnikrishnan, Balagopal
Tadic, Tony
Patel, Tirth
Duhamel, Joe
Kandel, Sonja
Moayedi, Yasbanoo
Brudno, Michael
Hope, Andrew
Ross, Heather
McIntosh, Chris
Source :
NPJ Digital Medicine; 5/14/2024, Vol. 7 Issue 1, p1-10, 10p
Publication Year :
2024

Abstract

Healthcare datasets are becoming larger and more complex, necessitating the development of accurate and generalizable AI models for medical applications. Unstructured datasets, including medical imaging, electrocardiograms, and natural language data, are gaining attention with advancements in deep convolutional neural networks and large language models. However, estimating the generalizability of these models to new healthcare settings without extensive validation on external data remains challenging. In experiments across 13 datasets including X-rays, CTs, ECGs, clinical discharge summaries, and lung auscultation data, our results demonstrate that model performance is frequently overestimated by up to 20% on average due to shortcut learning of hidden data acquisition biases (DAB). Shortcut learning refers to a phenomenon in which an AI model learns to solve a task based on spurious correlations present in the data as opposed to features directly related to the task itself. We propose an open-source, bias-corrected external accuracy estimate, P_Est, that better estimates external accuracy to within 4% on average by measuring and calibrating for DAB-induced shortcut learning. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
23986352
Volume :
7
Issue :
1
Database :
Complementary Index
Journal :
NPJ Digital Medicine
Publication Type :
Academic Journal
Accession number :
177250225
Full Text :
https://doi.org/10.1038/s41746-024-01118-4