1. Clinical application of radiological AI for pulmonary nodule evaluation: Replicability and susceptibility to the population shift caused by the COVID-19 pandemic.
- Author
- Vasilev Y, Vladzymyrskyy A, Arzamasov K, Omelyanskaya O, Shulkin I, Kozikhina D, Goncharova I, Reshetnikov R, Chetverikov S, Blokhin I, Bobrovskaya T, and Andreychenko A
- Subjects
- Humans, Pandemics, Radiography, Tomography, X-Ray Computed, COVID-19 diagnostic imaging, COVID-19 epidemiology, Radiology
- Abstract
Purpose: Replicability and generalizability of medical AI are recognized challenges that hinder broad AI deployment in clinical practice. Pulmonary nodule detection and characterization on chest CT images is one of the in-demand use cases for AI-based automation, and multiple AI solutions addressing this task are becoming available. Here, we evaluated and compared the performance of several commercially available radiological AI models on the same clinical task, using the same external datasets acquired before and during the COVID-19 pandemic.
Approach: Five commercially available AI models for pulmonary nodule detection were tested on two external datasets labelled by experts according to the intended clinical task. Dataset1 was acquired before the pandemic and did not contain radiological signs of COVID-19; dataset2 was collected during the pandemic and did contain radiological signs of COVID-19. ROC analysis was applied to dataset1 and dataset2 separately to select probability thresholds for each dataset. AUROC, sensitivity, and specificity were used to assess and compare AI performance.
Results: Statistically significant differences in AUROC values were observed between the AI models for dataset1, whereas for dataset2 the differences in AUROC values were not statistically significant. Sensitivity and specificity differed significantly between the AI models for dataset1. This difference was not significant for dataset2 when the probability threshold initially selected for dataset1 was applied. Updating the probability threshold based on dataset2 produced statistically significant differences in sensitivity and specificity between the AI models for dataset2. For 3 out of 5 AI models, updating the probability threshold helped compensate for the degradation in model performance under the population shift caused by the pandemic.
Conclusions: Population shift in the data can diminish the differences in performance between AI models. Updating the probability threshold when the population shifts appears to be valuable for preserving AI model performance without retraining.
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Copyright © 2023 Elsevier B.V. All rights reserved.
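The threshold update described in the abstract can be illustrated with a minimal sketch. The abstract does not state which criterion was used to pick thresholds from the ROC analysis, so the sketch assumes Youden's J (sensitivity + specificity - 1). The function names and the toy scores are hypothetical and invented for illustration; they are not data from the study.

```python
from typing import Optional, Sequence, Tuple


def sens_spec(labels: Sequence[int], scores: Sequence[float],
              threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def select_threshold(labels: Sequence[int],
                     scores: Sequence[float]) -> Optional[float]:
    """Return the observed score that maximizes Youden's J (assumed criterion)."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        sens, spec = sens_spec(labels, scores, t)
        j = sens + spec - 1
        if j > best_j:  # ties keep the lowest threshold
            best_t, best_j = t, j
    return best_t


# Invented toy data: 1 = nodule present, 0 = absent.
labels1 = [0, 0, 0, 0, 1, 1, 1, 1]
scores1 = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]      # "pre-shift" scores
labels2 = [0, 0, 0, 0, 1, 1, 1, 1]
scores2 = [0.3, 0.5, 0.65, 0.7, 0.75, 0.8, 0.9, 0.95]   # "post-shift" scores

t1 = select_threshold(labels1, scores1)       # threshold chosen on dataset1
stale = sens_spec(labels2, scores2, t1)       # dataset1 threshold reused on dataset2
t2 = select_threshold(labels2, scores2)       # threshold re-selected on dataset2
updated = sens_spec(labels2, scores2, t2)

print(f"dataset1 threshold {t1}: sens/spec on dataset2 = {stale}")
print(f"dataset2 threshold {t2}: sens/spec on dataset2 = {updated}")
```

In this toy case the scores of negative cases drift upward in the second dataset, so the reused threshold loses specificity while the re-selected one restores it. Whether a real model behaves this way depends on how its score distribution shifts, which is what the per-dataset ROC analysis is meant to reveal.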
- Published
- 2023