EMR-Based Phenotyping of Ischemic Stroke Using Supervised Machine Learning and Text Mining Techniques
- Source :
- IEEE Journal of Biomedical and Health Informatics. 24:2922-2931
- Publication Year :
- 2020
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2020.
Abstract
- Ischemic stroke is a major cause of death and disability in adults worldwide. Because its phenotypes are highly heterogeneous, phenotyping of ischemic stroke is an essential task for medical research and clinical prognostication. However, this task is not trivial when the study population is large: in previous studies, phenotyping of ischemic stroke depended primarily on manual annotation of medical records. This article evaluated various strategies for automated phenotyping of ischemic stroke into the four subtypes of the Oxfordshire Community Stroke Project classification, based on structured and unstructured data from electronic medical records (EMRs). A total of 4640 adult patients who were hospitalized for acute ischemic stroke in a teaching hospital were included. In addition to the structured items in the National Institutes of Health Stroke Scale, unstructured clinical narratives were preprocessed using MetaMap to identify medical concepts, which were then encoded into feature vectors. Various supervised machine learning algorithms were used to build classifiers. The results indicate that textual information from EMRs could facilitate phenotyping of ischemic stroke when combined with structured information. Furthermore, decomposing this multi-class problem into binary classification tasks and then aggregating the classification results could improve performance.
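The decompose-then-aggregate idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which decomposition scheme, aggregation rule, or learning algorithm performed best, so everything here is an assumption made for illustration: a one-vs-rest split over the four OCSP subtypes, a plain perceptron as a stand-in binary classifier, and aggregation by highest score. The field names (`motor_arm`, `visual_field`) and concept labels are hypothetical placeholders, not actual NIHSS item codes or MetaMap output.

```python
# Four subtypes of the Oxfordshire Community Stroke Project classification.
SUBTYPES = ["TACI", "PACI", "LACI", "POCI"]

# Hypothetical structured items and concept vocabulary (placeholders only).
STRUCTURED_KEYS = ["motor_arm", "visual_field"]
VOCABULARY = ["aphasia", "dysarthria", "pure_motor", "vertigo"]


def encode(record):
    """Concatenate structured scores with binary concept-presence features."""
    structured = [float(record["structured"].get(k, 0)) for k in STRUCTURED_KEYS]
    concepts = set(record["concepts"])
    bag = [1.0 if c in concepts else 0.0 for c in VOCABULARY]
    return structured + bag


def score(model, x):
    """Linear score w.x + b for one binary classifier."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_perceptron(X, y, epochs=20):
    """Train a binary perceptron; y holds +1 / -1 targets."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            if t * score((w, b), x) <= 0:  # misclassified or on the boundary
                w = [wi + t * xi for wi, xi in zip(w, x)]
                b += t
    return w, b


def train_one_vs_rest(X, labels):
    """Decompose the 4-class task into one binary task per subtype."""
    return {
        s: train_perceptron(X, [1 if lab == s else -1 for lab in labels])
        for s in SUBTYPES
    }


def predict(models, x):
    """Aggregate the binary classifiers by taking the highest score."""
    return max(SUBTYPES, key=lambda s: score(models[s], x))


# Toy records standing in for EMR data (entirely made up for the sketch).
records = [
    {"structured": {"motor_arm": 1, "visual_field": 1}, "concepts": ["aphasia"]},
    {"structured": {"motor_arm": 1, "visual_field": 0}, "concepts": ["dysarthria"]},
    {"structured": {"motor_arm": 0, "visual_field": 0}, "concepts": ["pure_motor"]},
    {"structured": {"motor_arm": 0, "visual_field": 1}, "concepts": ["vertigo"]},
]
labels = ["TACI", "PACI", "LACI", "POCI"]

X = [encode(r) for r in records]
models = train_one_vs_rest(X, labels)
predictions = [predict(models, x) for x in X]
```

On this toy data the sketch only shows the mechanics; it says nothing about the accuracy reported in the article, and a real pipeline would replace the placeholder features with concepts actually extracted from clinical narratives.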
- Subjects :
- Male
Feature vector
MEDLINE
030204 cardiovascular system & hematology
Machine learning
03 medical and health sciences
0302 clinical medicine
Health Information Management
Data Mining
Electronic Health Records
Humans
Medicine
Diagnosis, Computer-Assisted
Electrical and Electronic Engineering
Stroke
Aged
Ischemic Stroke
Natural Language Processing
Medical record
Unstructured data
Medical research
Computer Science Applications
Binary classification
Task analysis
Female
Supervised Machine Learning
Artificial intelligence
Algorithms
030217 neurology & neurosurgery
Biotechnology
Details
- ISSN :
- 2168-2208 and 2168-2194
- Volume :
- 24
- Database :
- OpenAIRE
- Journal :
- IEEE Journal of Biomedical and Health Informatics
- Accession number :
- edsair.doi.dedup.....a2960e41c0ee725093ca035bb5d72c0b