1. Automated Interstitial Lung Abnormality Probability Prediction at CT: A Stepwise Machine Learning Approach in the Boston Lung Cancer Study.
- Author
- Hata A, Aoyagi K, Hino T, Kawagishi M, Wada N, Song J, Wang X, Valtchinov VI, Nishino M, Muraguchi Y, Nakatsugawa M, Koga A, Sugihara N, Ozaki M, Hunninghake GM, Tomiyama N, Li Y, Christiani DC, and Hatabu H
- Subjects
- Humans; Retrospective Studies; Female; Male; Aged; Middle Aged; Boston; Lung/diagnostic imaging; Probability; Machine Learning; Tomography, X-Ray Computed/methods; Lung Diseases, Interstitial/diagnostic imaging; Lung Neoplasms/diagnostic imaging; Radiographic Image Interpretation, Computer-Assisted/methods
- Abstract
Background: It is increasingly recognized that interstitial lung abnormalities (ILAs) detected at CT have potential clinical implications, but automated identification of ILAs has not yet been fully established.
Purpose: To develop and test automated ILA probability prediction models using machine learning techniques on CT images.
Materials and Methods: This secondary analysis of a retrospective study included CT scans from patients in the Boston Lung Cancer Study collected between February 2004 and June 2017. Visual assessment of ILAs by two radiologists and a pulmonologist served as the ground truth. Automated ILA probability prediction models were developed that used a stepwise approach involving section inference and case inference models. The section inference model produced an ILA probability for each CT section, and the case inference model integrated these probabilities to generate the case-level ILA probability. For indeterminate sections and cases, both two- and three-label methods were evaluated. For the case inference model, we tested three machine learning classifiers (support vector machine [SVM], random forest [RF], and convolutional neural network [CNN]). Receiver operating characteristic analysis was performed to calculate the area under the receiver operating characteristic curve (AUC).
Results: A total of 1382 CT scans (mean patient age, 67 years ± 11 [SD]; 759 women) were included. Of the 1382 CT scans, 104 (8%) were assessed as having ILA, 492 (36%) as indeterminate for ILA, and 786 (57%) as without ILA according to ground-truth labeling. The cohort was divided into a training set (n = 96; ILA, n = 48), a validation set (n = 24; ILA, n = 12), and a test set (n = 1262; ILA, n = 44). Among the models evaluated (two- and three-label section inference models; two- and three-label SVM, RF, and CNN case inference models), the model using the three-label method in the section inference model and the two-label method and RF in the case inference model achieved the highest AUC, at 0.87.
Conclusion: The model demonstrated substantial performance in estimating ILA probability, indicating its potential utility in clinical settings.
© RSNA, 2024. Supplemental material is available for this article. See also the editorial by Zagurovskaya in this issue.
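The stepwise idea in the abstract (per-section probabilities integrated into one case-level probability, then scored by AUC) can be illustrated with a minimal sketch. This is not the authors' method: the abstract reports SVM, RF, and CNN case inference models, whereas the hypothetical `case_score` below swaps in a simple top-k mean as the aggregator, and the section probabilities and labels are made-up toy values. `roc_auc` uses the standard pairwise (Mann-Whitney) definition of AUC.

```python
from typing import Sequence


def case_score(section_probs: Sequence[float], top_k: int = 3) -> float:
    """Stand-in case-level score (an assumption, not the paper's RF model):
    the mean of the top_k highest per-section probabilities."""
    if not section_probs:
        raise ValueError("a case needs at least one section probability")
    top = sorted(section_probs, reverse=True)[:top_k]
    return sum(top) / len(top)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: (per-section probabilities, ground-truth label), all invented.
cases = [
    ([0.1, 0.8, 0.9, 0.7], 1),
    ([0.1, 0.2, 0.1], 0),
    ([0.3, 0.6, 0.5, 0.4], 1),
    ([0.9, 0.8, 0.1], 0),
]
labels = [label for _, label in cases]
scores = [case_score(probs) for probs, _ in cases]
auc = roc_auc(labels, scores)  # 3 of 4 positive/negative pairs are ordered correctly: 0.75
```

A learned classifier such as a random forest would replace `case_score` with a model trained on features derived from the section probabilities; the AUC computation would be unchanged.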
- Published
- 2024