1. Evaluating the accuracy of lung-RADS score extraction from radiology reports: Manual entry versus natural language processing.
- Author
- Gandomi A, Hasan E, Chusid J, Paul S, Inra M, Makhnevich A, Raoof S, Silvestri G, Bade BC, and Cohen SL
- Subjects
- Humans, Retrospective Studies, Radiology Information Systems standards, Early Detection of Cancer, Male, Female, Aged, Natural Language Processing, Lung Neoplasms diagnostic imaging, Tomography, X-Ray Computed
- Abstract
Introduction: Radiology scoring systems are critical to the success of lung cancer screening (LCS) programs, impacting patient care, adherence to follow-up, data management and reporting, and program evaluation. The Lung CT Screening Reporting and Data System (Lung-RADS) is a structured radiology scoring system that provides recommendations for LCS follow-up that are utilized (a) in clinical care and (b) by LCS programs monitoring rates of adherence to follow-up. Thus, accurate reporting and reliable collection of Lung-RADS scores are fundamental components of LCS program evaluation and improvement. Unfortunately, due to variability in radiology reports, extraction of Lung-RADS scores is non-trivial, and best practices do not exist. The purpose of this project is to compare mechanisms to extract Lung-RADS scores from free-text radiology reports.
Methods: We retrospectively analyzed reports of LCS low-dose computed tomography (LDCT) examinations performed at a multihospital integrated healthcare network in New York State between January 2016 and July 2023. We compared three methods of Lung-RADS score extraction: manual physician entry at time of report creation, manual LCS specialist entry after report creation, and an internally developed, rule-based natural language processing (NLP) algorithm. Accuracy, recall, precision, and completeness (i.e., the proportion of LCS exams to which a Lung-RADS score has been assigned) were compared between the three methods.
Results: The dataset includes 24,060 LCS examinations on 14,243 unique patients. The mean patient age was 65 years, and most patients were male (54 %) and white (75 %). The completeness rate was 65 %, 68 %, and 99 % for radiologists' manual entry, LCS specialists' entry, and the NLP algorithm, respectively. Accuracy, recall, and precision were high across all extraction methods (>94 %), though the NLP-based approach was consistently higher than both manual entry methods in all metrics.
Discussion: An NLP-based method of LCS score determination is an efficient and more accurate means of extracting Lung-RADS scores than manual review and data entry. NLP-based methods should be considered best practice for extracting structured Lung-RADS scores from free-text radiology reports.
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. The authors have no conflicts of interest related to the submission of this manuscript. Dr. Bade is a site PI on industry-sponsored trials for Nucleix, Delfi Diagnostics, and Biodesix.
(Copyright © 2024 Elsevier B.V. All rights reserved.)
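Illustrative note: the abstract does not describe the authors' algorithm in detail, so the following is only a minimal sketch of what a rule-based Lung-RADS extractor and the completeness metric might look like. The regular expression, the function names `extract_lung_rads` and `completeness`, and the set of accepted categories (0 to 3, 4A, 4B, 4X, with an optional S modifier) are assumptions made for illustration, not the published method.

```python
import re
from typing import Optional, Sequence

# Hypothetical pattern: matches phrases such as "Lung-RADS category 4A",
# "Lung RADS v1.1: 2S", or "LungRADS score 3". Not the authors' rules.
_LUNG_RADS = re.compile(
    r"lung[\s-]*rads"                                  # name, with or without hyphen/space
    r"(?:\s*(?:v|version)?\s*\d\.\d)?"                 # optional version such as "v1.1"
    r"[\s:,-]*(?:category|score|assessment)?[\s:=]*"   # optional label and separators
    r"(4[abx]|[0-3])(s)?"                              # category, optional S modifier
    r"(?!\.\d)(?!\w)",                                 # do not stop inside "1.1" or a word
    re.IGNORECASE,
)


def extract_lung_rads(report: str) -> Optional[str]:
    """Return the first Lung-RADS category found in the report text, or None."""
    match = _LUNG_RADS.search(report)
    if match is None:
        return None
    category = match.group(1).upper()
    modifier = "S" if match.group(2) else ""
    return category + modifier


def completeness(scores: Sequence[Optional[str]]) -> float:
    """Proportion of exams to which a Lung-RADS score has been assigned."""
    if not scores:
        return 0.0
    return sum(score is not None for score in scores) / len(scores)


if __name__ == "__main__":
    reports = [
        "IMPRESSION: Lung-RADS category 4A. Recommend follow-up.",
        "Lung RADS v1.1: 2S",
        "No suspicious nodules.",
    ]
    scores = [extract_lung_rads(r) for r in reports]
    print(scores, completeness(scores))
```

A real system would need many more rules (negation, addenda, multiple scores per report) and validation against manually reviewed reports, as the study did for accuracy, recall, and precision.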
- Published
- 2024