1. Generalizability of Deep Learning Classification of Spinal Osteoporotic Compression Fractures on Radiographs Using an Adaptation of the Modified-2 Algorithm-Based Qualitative Criteria
- Author
- Dong, Qifei; Luo, Gang; Lane, Nancy E; Lui, Li-Yung; Marshall, Lynn M; Johnston, Sandra K; Dabbous, Howard; O'Reilly, Michael; Linnau, Ken F; Perry, Jessica; Chang, Brian C; Renslo, Jonathan; Haynor, David; Jarvik, Jeffrey G; and Cross, Nathan M
- Subjects
Osteoporosis, Prevention, Good Health and Well Being, Deep learning, Opportunistic screening, Osteoporotic fracture, Radiography, Clinical Sciences, Nuclear Medicine & Medical Imaging
- Abstract
Rationale and objectives: Spinal osteoporotic compression fractures (OCFs) can be an early biomarker for osteoporosis but are often subtle, incidental, and underreported. To ensure early diagnosis and treatment of osteoporosis, we aimed to build a deep learning vertebral body classifier for OCFs as a critical component of our future automated opportunistic screening tool.
Materials and methods: We retrospectively assembled a local dataset, including 1790 subjects and 15,050 vertebral bodies (thoracic and lumbar). Each vertebral body was annotated using an adaptation of the modified-2 algorithm-based qualitative criteria. The Osteoporotic Fractures in Men (MrOS) Study dataset provided thoracic and lumbar spine radiographs of 5994 men from six clinical centers. Using both datasets, five deep learning algorithms were trained to classify each individual vertebral body of the spine radiographs. Classification performance was compared for these models using multiple metrics, including the area under the receiver operating characteristic curve (AUC-ROC), sensitivity, specificity, and positive predictive value (PPV).
Results: Our best model, built with ensemble averaging, achieved an AUC-ROC of 0.948 and 0.936 on the local dataset's test set and the MrOS dataset's test set, respectively. After setting the cutoff threshold to prioritize PPV, this model achieved, on the local and MrOS test sets respectively, a sensitivity of 54.5% and 47.8%, a specificity of 99.7% and 99.6%, and a PPV of 89.8% and 94.8%.
Conclusion: Our model achieved an AUC-ROC > 0.90 on both datasets. This testing shows some generalizability to real-world clinical datasets and a suitable performance for a future opportunistic osteoporosis screening tool.
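The ensemble averaging and PPV-prioritizing cutoff described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy probabilities, the target PPV of 0.9, and the rule of taking the lowest cutoff that meets the target are all assumptions made for illustration. In practice the cutoff would normally be chosen on validation data and then applied unchanged to the test sets.

```python
from statistics import mean


def ensemble_average(per_model_probs):
    """Average fracture probabilities across models, one value per vertebral body."""
    return [mean(column) for column in zip(*per_model_probs)]


def confusion_metrics(probs, labels, threshold):
    """Return (sensitivity, specificity, ppv), predicting positive when prob >= threshold."""
    tp = fp = fn = tn = 0
    for prob, label in zip(probs, labels):
        predicted_positive = prob >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity, ppv


def pick_threshold_for_ppv(probs, labels, target_ppv):
    """Lowest observed cutoff whose PPV meets the target, or None if none does.

    Sensitivity never increases as the cutoff rises, so the lowest qualifying
    cutoff keeps as much sensitivity as the PPV target allows.
    """
    for threshold in sorted(set(probs)):
        _, _, ppv = confusion_metrics(probs, labels, threshold)
        if ppv >= target_ppv:
            return threshold
    return None


# Toy data (hypothetical): two models scoring six vertebral bodies.
model_a = [0.875, 0.75, 0.25, 0.625, 0.125, 0.25]
model_b = [0.625, 0.75, 0.5, 0.375, 0.375, 0.0]
labels = [1, 1, 1, 0, 0, 0]  # 1 = fracture, 0 = no fracture

averaged = ensemble_average([model_a, model_b])
cutoff = pick_threshold_for_ppv(averaged, labels, target_ppv=0.9)
print(cutoff, confusion_metrics(averaged, labels, cutoff))
```

On the toy data, raising the cutoff removes the false positives at the cost of missing one true fracture, which mirrors the trade-off reported in the abstract: high specificity and PPV with reduced sensitivity.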
- Published
- 2023