1. Automatic Normalization of Anatomical Phrases in Radiology Reports Using Unsupervised Learning
- Author
Paul J. Chang, Peter Prinsen, Pritesh Patel, Gabriel Ryan Mankovich, Amir M. Tahmasebi, Rob van Ommering, Henghui Zhu, Martin L. Gunn, Prescott Klassen, and Sam Pilato
- Subjects
Normalization (statistics), medicine.medical_specialty, Word embedding, Computer science, Human error, Article, Workflow, 030218 nuclear medicine & medical imaging, 03 medical and health sciences, 0302 clinical medicine, Terminology as Topic, medicine, Electronic Health Records, Humans, Radiology, Nuclear Medicine and imaging, Word2vec, SNOMED CT, Radiological and Ultrasound Technology, Recall, Concept map, Computer Science Applications, Unsupervised learning, Radiology, 030217 neurology & neurosurgery, Unsupervised Machine Learning
- Abstract
In today's radiology workflow, free-text reporting is established as the most common medium to capture, store, and communicate clinical information. Radiologists routinely refer to a patient's prior radiology reports to recall critical information for a new diagnosis, a process that is tedious, time-consuming, and prone to human error. Automatic structuring of report content is desired to facilitate such retrieval of information. In this work, we propose an unsupervised machine learning approach to automatically structure radiology reports by detecting and normalizing anatomical phrases based on the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) ontology. The proposed approach combines word embedding-based semantic learning with ontology-based concept mapping to derive the desired concept normalization. The word embedding model was trained using a large corpus of unlabeled radiology reports. Fifty-six anatomical labels were extracted from SNOMED CT as class labels covering the whole human anatomy. The proposed framework was compared against a number of state-of-the-art supervised and unsupervised approaches. Radiology reports from three different clinical sites were manually labeled for testing. The proposed approach outperformed the other techniques, yielding an average precision of 82.6%. The proposed framework boosts the coverage and performance of conventional approaches for concept normalization by applying word embedding techniques in semantic learning, while avoiding the need for a large amount of annotated data, which is typically required for training classifiers.
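The general idea of pairing word embeddings with a fixed set of ontology labels can be illustrated with a minimal sketch. This is not the authors' implementation: the three labels, the 3-dimensional vectors, and the function names (`phrase_vector`, `cosine`, `normalize_phrase`) are all hypothetical, and the numbers are invented for illustration. A real system would presumably load vectors trained on a report corpus (for example with Word2vec) and use the 56 SNOMED CT-derived labels mentioned in the abstract.

```python
import math

# Invented toy "embeddings"; a real model would be trained on radiology reports.
TOY_VECTORS = {
    "lung": [0.9, 0.1, 0.0],
    "pulmonary": [0.8, 0.2, 0.1],
    "lobe": [0.6, 0.3, 0.2],
    "liver": [0.1, 0.9, 0.1],
    "hepatic": [0.2, 0.8, 0.0],
    "brain": [0.0, 0.1, 0.9],
    "cerebral": [0.1, 0.0, 0.8],
}

# Hypothetical anatomical class labels, each described by one or more terms.
LABEL_TERMS = {"Lung": ["lung"], "Liver": ["liver"], "Brain": ["brain"]}


def phrase_vector(words):
    """Average the vectors of the known words; return None if none are known."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        return None
    return [sum(component) / len(known) for component in zip(*known)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def normalize_phrase(phrase):
    """Return (label, similarity) for the closest label, or None if unmappable."""
    vec = phrase_vector(phrase.lower().split())
    if vec is None:
        return None
    scored = []
    for label, terms in LABEL_TERMS.items():
        label_vec = phrase_vector(terms)
        if label_vec is not None:
            scored.append((cosine(vec, label_vec), label))
    if not scored:
        return None
    best_score, best_label = max(scored)
    return best_label, best_score


if __name__ == "__main__":
    for text in ["pulmonary lobe", "hepatic", "cerebral", "unknown term"]:
        print(text, "->", normalize_phrase(text))
```

No labeled training examples are needed in this sketch: the mapping relies only on vector similarity between a phrase and the label terms, which is the property the abstract highlights for the unsupervised setting.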
- Published
2018