1. Assessing the Performance of Models from the 2022 RSNA Cervical Spine Fracture Detection Competition at a Level I Trauma Center.
- Author
-
Hu, Zixuan; Patel, Markand; Ball, Robyn; Lin, Hui; Prevedello, Luciano; Naseri, Mitra; Mathur, Shobhit; Moreland, Robert; Wilson, Jefferson; Witiw, Christopher; Yeom, Kristen; Ha, Qishen; Hanley, Darragh; Seferbekov, Selim; Chen, Hao; Singer, Philipp; Henkel, Christof; Pfeiffer, Pascal; Pan, Ian; Sheoran, Harshit; Li, Wuqi; Flanders, Adam; Kitamura, Felipe; Richards, Tyler; Talbott, Jason; Sejdić, Ervin; and Colak, Errol
- Subjects
CT; Convolutional Neural Network (CNN); Feature Detection; Genetic Algorithms; Head/Neck; Spine; Supervised Learning; Technology Assessment; Humans; Male; Cervical Vertebrae; Middle Aged; Spinal Fractures; Trauma Centers; Tomography, X-Ray Computed; Retrospective Studies; Female; Sensitivity and Specificity; Adult; Contrast Media
- Abstract
Purpose To evaluate the performance of the top models from the RSNA 2022 Cervical Spine Fracture Detection challenge on a clinical test dataset of both noncontrast and contrast-enhanced CT scans acquired at a level I trauma center.
Materials and Methods Seven top-performing models in the RSNA 2022 Cervical Spine Fracture Detection challenge were retrospectively evaluated on a clinical test set of 1828 CT scans (from 1829 series: 130 positive for fracture, 1699 negative for fracture; 1308 noncontrast, 521 contrast enhanced) from 1779 patients (mean age, 55.8 years ± 22.1 [SD]; 1154 [64.9%] male patients). Scans were acquired without exclusion criteria over 1 year (January-December 2022) from the emergency department of a neurosurgical and level I trauma center. Model performance was assessed using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. False-positive and false-negative cases were further analyzed by a neuroradiologist.
Results Although all seven models showed decreased performance on the clinical test set compared with the challenge dataset, the models maintained high performance. On noncontrast CT scans, the models achieved a mean AUC of 0.89 (range: 0.79-0.92), sensitivity of 67.0% (range: 30.9%-80.0%), and specificity of 92.9% (range: 82.1%-99.0%). On contrast-enhanced CT scans, the models had a mean AUC of 0.88 (range: 0.76-0.94), sensitivity of 81.9% (range: 42.7%-100.0%), and specificity of 72.1% (range: 16.4%-92.8%). The models identified 10 fractures missed by radiologists. False-positive cases were more common in contrast-enhanced scans and were observed in patients with degenerative changes on noncontrast scans, while false-negative cases were often associated with degenerative changes and osteopenia.
Conclusion The winning models from the 2022 RSNA AI Challenge demonstrated a high performance for cervical spine fracture detection on a clinical test dataset, warranting further evaluation for their use as clinical support tools.
Keywords: Feature Detection, Supervised Learning, Convolutional Neural Network (CNN), Genetic Algorithms, CT, Spine, Technology Assessment, Head/Neck.
Supplemental material is available for this article.
© RSNA, 2024
See also commentary by Levi and Politi in this issue.
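The three metrics named in the abstract (AUC, sensitivity, and specificity) can be illustrated with a minimal Python sketch. This is not the study's evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration, and none of the values come from the article.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], preds: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    # Sensitivity: fraction of fracture-positive scans that were flagged.
    # Specificity: fraction of fracture-negative scans that were cleared.
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half); equivalent to the area under the
    ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fracture present, 0 = no fracture.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]

# Assumed decision threshold of 0.5 (not taken from the article).
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_preds)
toy_auc = auc(toy_labels, toy_scores)
print(f"sensitivity={toy_sens:.3f} specificity={toy_spec:.3f} AUC={toy_auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is one way a set of models can show similar AUCs but widely varying sensitivity and specificity.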
- Published
- 2024