1. Assessing the Accuracy of a Deep Learning Method to Risk Stratify Indeterminate Pulmonary Nodules
- Author
- Jerome Declerck, Sarim Ather, Jonas Kunst, Catarina Santos, Gary T. Smith, Sanja L. Antic, Carlos Arteta, David Dufek, Fergus V. Gleeson, W. Hickes, Jan Brabec, Pierre P. Massion, Heiko Peschl, P. Novotny, L. Pickup, Bennett A. Landman, Timor Kadir, Heidi Chen, A. Talwar, and Reginald F. Munden
- Subjects
- Pulmonary and Respiratory Medicine, medicine.medical_specialty, Lung Neoplasms, risk stratification, Critical Care and Intensive Care Medicine, Malignancy, 03 medical and health sciences, Deep Learning, 0302 clinical medicine, medicine, computer-aided image analysis, Humans, 030212 general & internal medicine, early detection, Lung cancer, Receiver operating characteristic, business.industry, Deep learning, Original Articles, neural networks, medicine.disease, Confidence interval, 3. Good health, lung cancer, Lung Cancer and Oncological Disorders, 030228 respiratory system, Multiple Pulmonary Nodules, National Lung Screening Trial, Radiology, Artificial intelligence, business, Indeterminate, Precancerous Conditions, Clinical risk factor
- Abstract
- Rationale: The management of indeterminate pulmonary nodules (IPNs) remains challenging, resulting in invasive procedures and delays in diagnosis and treatment. Strategies to decrease the rate of unnecessary invasive procedures and optimize surveillance regimens are needed. Objectives: To develop and validate a deep learning method to improve the management of IPNs. Methods: A Lung Cancer Prediction Convolutional Neural Network model was trained using computed tomography images of IPNs from the National Lung Screening Trial, internally validated, and externally tested on cohorts from two academic institutions (Vanderbilt and Oxford). Measurements and Main Results: The areas under the receiver operating characteristic curve in the external validation cohorts were 83.5% (95% confidence interval [CI], 75.4–90.7%) and 91.9% (95% CI, 88.7–94.7%), compared with 78.1% (95% CI, 68.7–86.4%) and 81.9% (95% CI, 76.1–87.1%), respectively, for a commonly used clinical risk model for incidental nodules (the Mayo model). Using 5% and 65% malignancy thresholds defining low- and high-risk categories, the overall net reclassifications in the validation cohorts for cancers and benign nodules compared with the Mayo model were 0.34 (Vanderbilt) and 0.30 (Oxford) as a rule-in test, and 0.33 (Vanderbilt) and 0.58 (Oxford) as a rule-out test. Compared with traditional risk prediction models, the Lung Cancer Prediction Convolutional Neural Network was associated with improved accuracy in predicting the likelihood of disease at each threshold of management and in our external validation cohorts. Conclusions: This study demonstrates that this deep learning algorithm can correctly reclassify IPNs into low- or high-risk categories in more than a third of cancers and benign nodules when compared with conventional risk models, potentially reducing the number of unnecessary invasive procedures and delays in diagnosis.
- Published
- 2020
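
The abstract reports net reclassification at the 5% (rule-out) and 65% (rule-in) malignancy thresholds but does not spell out the formula. The sketch below assumes the standard two-category net reclassification improvement: the net proportion of cancers moved above the threshold plus the net proportion of benign nodules moved below it. The study's exact definition may differ. The function name `two_category_nri` and all probabilities are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def two_category_nri(
    labels: Sequence[int],
    old_probs: Sequence[float],
    new_probs: Sequence[float],
    threshold: float,
) -> float:
    """Two-category net reclassification improvement at a single threshold.

    labels: 1 = malignant, 0 = benign.
    A nodule counts as "above" when its predicted probability >= threshold.
    This is an assumed, textbook-style formulation, not the paper's code.
    """
    n_event = sum(labels)
    n_nonevent = len(labels) - n_event
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one malignant and one benign case")

    up_event = down_event = up_nonevent = down_nonevent = 0
    for y, p_old, p_new in zip(labels, old_probs, new_probs):
        was_above = p_old >= threshold
        is_above = p_new >= threshold
        if is_above and not was_above:  # moved up across the threshold
            if y == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif was_above and not is_above:  # moved down across the threshold
            if y == 1:
                down_event += 1
            else:
                down_nonevent += 1

    # Cancers should move up; benign nodules should move down.
    return (up_event - down_event) / n_event + (
        down_nonevent - up_nonevent
    ) / n_nonevent


# Hypothetical toy cohort: four cancers followed by four benign nodules.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
old_model = [0.70, 0.40, 0.50, 0.80, 0.30, 0.70, 0.04, 0.10]  # baseline risk model
new_model = [0.90, 0.75, 0.60, 0.85, 0.02, 0.40, 0.03, 0.03]  # candidate model

rule_in = two_category_nri(labels, old_model, new_model, threshold=0.65)
rule_out = two_category_nri(labels, old_model, new_model, threshold=0.05)
print(f"rule-in NRI (65%): {rule_in:.2f}")
print(f"rule-out NRI (5%): {rule_out:.2f}")
```

In this toy cohort, one cancer moves above 65% and one benign nodule moves below it, so the rule-in value is 0.25 + 0.25 = 0.50. At the 5% threshold no cancer changes category and two benign nodules drop below it, so the rule-out value is 0 + 0.50 = 0.50.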