Closing the AI generalization gap by adjusting for dermatology condition distribution differences across clinical settings

Authors :
Rikhye, Rajeev V.
Loh, Aaron
Hong, Grace Eunhae
Singh, Preeti
Smith, Margaret Ann
Muralidharan, Vijaytha
Wong, Doris
Sayres, Rory
Phung, Michelle
Betancourt, Nicolas
Fong, Bradley
Sahasrabudhe, Rachna
Nasim, Khoban
Eschholz, Alec
Mustafa, Basil
Freyberg, Jan
Spitz, Terry
Matias, Yossi
Corrado, Greg S.
Chou, Katherine
Webster, Dale R.
Bui, Peggy
Liu, Yuan
Liu, Yun
Ko, Justin
Lin, Steven
Publication Year :
2024

Abstract

Recently, there has been great progress in the ability of artificial intelligence (AI) algorithms to classify dermatological conditions from clinical photographs. However, little is known about the robustness of these algorithms in real-world settings, where several factors can lead to a loss of generalizability. Understanding and overcoming these limitations will permit the development of generalizable AI that can aid in the diagnosis of skin conditions across a variety of clinical settings. In this retrospective study, we demonstrate that differences in skin condition distribution, rather than in demographics or image capture mode, are the main source of errors when an AI algorithm is evaluated on data from a previously unseen source. We demonstrate a series of steps to close this generalization gap, requiring progressively more information about the new source, ranging from the condition distribution to training data enriched for data less frequently seen during training. Our results also suggest comparable performance from end-to-end fine-tuning versus fine-tuning solely the classification layer on top of a frozen embedding model. Our approach can inform the adaptation of AI algorithms to new settings, based on the information and resources available.
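
The first adaptation step mentioned in the abstract, using only the new source's condition distribution, can be illustrated with a standard prior-shift correction: predicted probabilities are reweighted by the ratio of target to source class priors and then renormalized. This is a minimal sketch of that general technique, not necessarily the procedure used in the paper. The function name `adjust_for_prior_shift`, the three-condition setup, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def adjust_for_prior_shift(probs, source_prior, target_prior):
    """Reweight class probabilities for a change in class priors.

    Assumes only the class priors differ between the training source and
    the new source (the classic label-shift assumption).
    """
    probs = np.asarray(probs, dtype=float)
    # Ratio of how common each condition is in the new vs. the training source.
    weights = np.asarray(target_prior, dtype=float) / np.asarray(source_prior, dtype=float)
    adjusted = probs * weights
    # Renormalize so each row is again a probability distribution.
    return adjusted / adjusted.sum(axis=-1, keepdims=True)


# Hypothetical priors over three conditions (illustrative values only).
source_prior = np.array([0.6, 0.3, 0.1])  # training source
target_prior = np.array([0.2, 0.3, 0.5])  # previously unseen source

# Hypothetical model output for one case.
probs = np.array([[0.5, 0.3, 0.2]])
adjusted = adjust_for_prior_shift(probs, source_prior, target_prior)
print(adjusted.round(3))
```

In this toy case the top-ranked condition changes after adjustment, because the third condition is much more common in the new source than in the training source. The correction needs no images or labels from the new setting, only an estimate of its condition distribution.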

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2402.15566
Document Type :
Working Paper