Assessing generalizability of an AI-based visual test for cervical cancer screening.

Authors :: Ahmed SR
Egemen D
Befano B
Rodriguez AC
Jeronimo J
Desai K
Teran C
Alfaro K
Fokom-Domgue J
Charoenkwan K
Mungo C
Luckett R
Saidu R
Raiol T
Ribeiro A
Gage JC
de Sanjose S
Kalpathy-Cramer J
Schiffman M
Source :: PLOS digital health [PLOS Digit Health] 2024 Oct 02; Vol. 3 (10), pp. e0000364. Date of Electronic Publication: 2024 Oct 02 (Print Publication: 2024).
Publication Year :: 2024
Abstract: A number of challenges hinder artificial intelligence (AI) models from effective clinical translation. Foremost among these challenges is the lack of generalizability, which is defined as the ability of a model to perform well on datasets that have different characteristics from the training data. We recently investigated the development of an AI pipeline on digital images of the cervix, utilizing a multi-heterogeneous dataset of 9,462 women (17,013 images) and a multi-stage model selection and optimization approach, to generate a diagnostic classifier able to classify images of the cervix into "normal", "indeterminate" and "precancer/cancer" (denoted as "precancer+") categories. In this work, we investigate the performance of this multiclass classifier on external data not utilized in training and internal validation, to assess the generalizability of the classifier when moving to new settings. We assessed both the classification performance and repeatability of our classifier model across the two axes of heterogeneity present in our dataset: image capture device and geography, utilizing both out-of-the-box inference and retraining with external data. Our results demonstrate that device-level heterogeneity affects our model performance more than geography-level heterogeneity. Classification performance of our model is strong on images from a new geography without retraining, while incremental retraining with inclusion of images from a new device progressively improves classification performance on that device up to a point of saturation. Repeatability of our model is relatively unaffected by data heterogeneity and remains strong throughout. Our work supports the need for optimized retraining approaches that address data heterogeneity (e.g., when moving to a new device) to facilitate effective use of AI models in new settings.
Competing Interests: The authors have declared that no competing interests exist.
Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Details

Language :: English
ISSN :: 2767-3170
Volume :: 3
Issue :: 10
Database :: MEDLINE
Journal :: PLOS digital health
Publication Type :: Academic Journal
Accession number :: 39356713
Full Text :: https://doi.org/10.1371/journal.pdig.0000364
