Savage CH, Tanwar M, Elkassem AA, Sturdivant A, Hamki O, Sotoudeh H, Sirineni G, Singhal A, Milner D, Jones J, Rehder D, Li M, Li Y, Junck K, Tridandapani S, Rothenberg SA, and Smith AD
BACKGROUND. Retrospective studies evaluating artificial intelligence (AI) algorithms for intracranial hemorrhage (ICH) detection on noncontrast CT (NCCT) have shown promising results but lack prospective validation. OBJECTIVE. The purpose of this article was to evaluate the impact on radiologists' real-world aggregate performance for ICH detection and report turnaround times for ICH-positive examinations of a radiology department's implementation of an AI triage and notification system for ICH detection on head NCCT examinations. METHODS. This prospective single-center study included adult patients who underwent head NCCT examinations from May 12, 2021, to June 30, 2021 (phase 1), or from September 30, 2021, to December 4, 2021 (phase 2). Before phase 1, the radiology department implemented a commercial AI triage system for ICH detection that processed head NCCT examinations and notified radiologists of positive results through a widget with a floating pop-up display. Examinations were interpreted by neuroradiologists or emergency radiologists, who evaluated examinations without and with AI assistance in phases 1 and 2, respectively. A panel of radiologists conducted a review process for all examinations with discordance between the radiology report and AI and a subset of remaining examinations to establish the reference standard. Diagnostic performance and report turnaround times were compared using the Pearson chi-square test and Wilcoxon rank sum test, respectively. Bonferroni correction was used to account for five diagnostic performance metrics (adjusted significance threshold, .01 [α = .05/5]). RESULTS. A total of 9954 examinations from 7371 patients (mean age, 54.8 ± 19.8 [SD] years; 3773 women, 3598 men) were included. In phases 1 and 2, 19.8% (735/3716) and 21.9% (1368/6238) of examinations, respectively, were positive for ICH (p = .01).
Radiologists without versus with AI showed no significant difference in accuracy (99.5% vs 99.2%), sensitivity (98.6% vs 98.9%), PPV (99.0% vs 97.5%), or NPV (99.7% vs 99.7%) (all p > .01); specificity was higher for radiologists without than with AI (99.8% vs 99.3%, respectively; p = .004). Mean report turnaround time for ICH-positive examinations was 147.1 minutes without AI versus 149.9 minutes with AI (p = .11). CONCLUSION. An AI triage system for ICH detection did not improve radiologists' diagnostic performance or report turnaround times. CLINICAL IMPACT. This large prospective real-world study does not support use of AI assistance for ICH detection.
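
The statistical steps named in METHODS can be illustrated with the counts reported in RESULTS. The following is a minimal sketch, not the authors' analysis code: it applies a Pearson chi-square test (1 degree of freedom) to the ICH-positive proportions in the two phases and computes the Bonferroni-adjusted threshold. The function name `pearson_chi_square_2x2` is hypothetical, and the absence of a continuity correction is an assumption, because the abstract does not say whether one was applied.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction (the abstract does not specify).
    Returns the test statistic and the two-sided p value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts reported in RESULTS: ICH-positive examinations / all examinations.
phase1_positive, phase1_total = 735, 3716
phase2_positive, phase2_total = 1368, 6238

chi2, p_value = pearson_chi_square_2x2(
    phase1_positive, phase1_total - phase1_positive,
    phase2_positive, phase2_total - phase2_positive,
)

# Bonferroni correction for five diagnostic performance metrics (alpha = .05/5).
bonferroni_threshold = 0.05 / 5

# The specificity comparison is reported with p = .004, below the adjusted threshold.
specificity_significant = 0.004 < bonferroni_threshold

print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
print(f"Bonferroni-adjusted threshold = {bonferroni_threshold:.2f}")
```

Under these assumptions the prevalence comparison yields a p value that rounds to the reported .01. The diagnostic performance metrics cannot be recomputed the same way, because the abstract reports only percentages for them, not the underlying counts.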