1. Impact of Original and Artificially Improved Artificial Intelligence–based Computer-aided Diagnosis on Breast US Interpretation
- Author
Andriy I. Bandos, Cathy S. Tyma, Terri-Ann Gizienski, Katie M. Davis, Grace Y. Rathfon, David Gur, Wendie A. Berg, Christiane M. Hakim, Bronwyn E. Nair, Gordon S. Abrams, Uzma X. Waheed, and Amar S. Mehta
- Subjects
03 medical and health sciences; 0302 clinical medicine; Radiological and Ultrasound Technology; Breast imaging; Computer-aided diagnosis; business.industry; Computer science; 030220 oncology & carcinogenesis; Interpretation (philosophy); Radiology, Nuclear Medicine and imaging; Artificial intelligence; business; 030218 nuclear medicine & medical imaging
- Abstract
Objective: For breast US interpretation, to assess the impact of computer-aided diagnosis (CADx) in original mode or with improved sensitivity or specificity.
Methods: In this IRB-approved protocol, orthogonal-paired US images of 319 lesions identified on screening, including 88 (27.6%) cancers (median 7 mm, range 1–34 mm), were reviewed by 9 breast imaging radiologists. Each observer provided BI-RADS assessments (2, 3, 4A, 4B, 4C, 5) before and after CADx in a mode-balanced design: mode 1, original CADx (outputs benign, probably benign, suspicious, or malignant); mode 2, artificially-high-sensitivity CADx (benign or malignant); and mode 3, artificially-high-specificity CADx (benign or malignant). Area under the receiver operating characteristic curve (AUC) was estimated under each modality and for standalone CADx outputs. Multi-reader analysis accounted for inter-reader variability and correlation between same-lesion assessments.
Results: AUC of standalone CADx was 0.77 (95% CI: 0.72–0.83). For mode 1, average reader AUC was 0.82 (range 0.76–0.84) without CADx and was not significantly changed with CADx. In high-sensitivity mode, all observers’ AUCs increased: average AUC 0.83 (range 0.78–0.86) before CADx increased to 0.88 (range 0.84–0.90), P < 0.001. In high-specificity mode, all observers’ AUCs increased: average AUC 0.82 (range 0.76–0.84) before CADx increased to 0.89 (range 0.87–0.92), P < 0.0001. Radiologists responded more frequently to malignant CADx cues in high-specificity mode (42.7% vs 23.2% in mode 1 and 27.0% in mode 2, P = 0.008).
Conclusion: Original CADx did not substantially impact radiologists’ interpretations. Radiologists showed improved performance and were more responsive when CADx produced fewer false-positive malignant cues.
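As a rough illustration of the AUC figure of merit named in the Methods, the sketch below computes an empirical (Mann-Whitney) AUC for a single reader from ordinal BI-RADS assessments. It is not the study's analysis: the ratings are made-up toy values, the function name `empirical_auc` and the rank mapping `BIRADS_ORDER` are hypothetical, and the sketch ignores the inter-reader variability and same-lesion correlation that the multi-reader analysis accounted for.

```python
from itertools import product

# Ordinal ranks for the BI-RADS categories listed in the abstract
# (mapping is an assumption made for this illustration).
BIRADS_ORDER = {"2": 0, "3": 1, "4A": 2, "4B": 3, "4C": 4, "5": 5}


def empirical_auc(cancer_ratings, benign_ratings):
    """Mann-Whitney estimate of AUC for one reader.

    Equals P(cancer rated higher than benign) + 0.5 * P(tie),
    taken over all cancer/benign lesion pairs.
    """
    if not cancer_ratings or not benign_ratings:
        raise ValueError("need at least one cancer and one benign rating")
    score = 0.0
    for c, b in product(cancer_ratings, benign_ratings):
        rank_c, rank_b = BIRADS_ORDER[c], BIRADS_ORDER[b]
        if rank_c > rank_b:
            score += 1.0   # cancer correctly rated more suspicious
        elif rank_c == rank_b:
            score += 0.5   # tie counts as half
    return score / (len(cancer_ratings) * len(benign_ratings))


# Toy ratings, invented for illustration only (not study data).
toy_cancers = ["4B", "5", "3", "4C"]
toy_benign = ["2", "3", "4A", "2", "3"]
toy_auc = empirical_auc(toy_cancers, toy_benign)
print(f"toy single-reader AUC: {toy_auc:.2f}")
```

Comparing this quantity for the same reader before and after the CADx cue would give a per-reader change in AUC; averaging across readers and testing significance requires a multi-reader method such as the one the abstract refers to.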
- Published
2021