LATTE: A knowledge-based method to normalize various expressions of laboratory test results in free text of Chinese electronic health records
- Source :
- Journal of Biomedical Informatics. 102:103372
- Publication Year :
- 2020
- Publisher :
- Elsevier BV, 2020.
Abstract
- Background: A wealth of clinical information is buried in the free text of electronic health records (EHRs), and converting it to a machine-understandable form is crucial for the secondary use of EHRs. Laboratory test results, one of the most important types of clinical information, are written in various styles in the free text of EHRs. This variety makes data integration and reuse of EHR data difficult. Technology that normalizes the different expressions of laboratory test results in free text is therefore indispensable for the secondary use of EHRs.
Methods: In this study, we developed a knowledge-based method named LATTE (transforming lab test results), which transforms various expressions of laboratory test results into a normalized, machine-understandable format. We first identified the analyte of a laboratory test result with a dictionary-based method and then designed a series of rules to detect information associated with the analyte, including its specimen, measured value, unit of measure, conclusive phrase and sampling factor. We determined whether a test result is normal or abnormal by interpreting the conclusive phrase or by comparing the measured value with an appropriate normal range. Finally, we converted the various expressions of laboratory test results, whether numeric or textual, into the normalized form “specimen-analyte-abnormality”. With this method, a laboratory test with the same type of abnormality has the same representation, regardless of how it is mentioned in free text.
Results: LATTE was developed and optimized on a training set of 8894 laboratory test results from 756 EHRs, and evaluated on a test set of 3740 laboratory test results from 210 EHRs. Compared to experts’ annotations, LATTE achieved a precision of 0.936, a recall of 0.897 and an F1 score of 0.916 on the training set, and a precision of 0.892, a recall of 0.843 and an F1 score of 0.867 on the test set. For 223 laboratory tests with at least two different expression forms in the test set, LATTE transformed 85.7% (2870/3350) of laboratory test results into a normalized form. In addition, LATTE achieved F1 scores above 0.8 for EHRs from 18 of 21 different hospital departments, suggesting that it generalizes across departments when normalizing laboratory test results.
Conclusion: LATTE is an effective method for normalizing various expressions of laboratory test results in the free text of EHRs. LATTE will facilitate EHR-based applications such as cohort querying, patient clustering and machine learning.
Availability: LATTE is freely available for download on GitHub (https://github.com/denglizong/LATTE).
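The pipeline described under Methods (dictionary lookup of the analyte, rule-based extraction of the value or conclusive phrase, then mapping to “specimen-analyte-abnormality”) can be illustrated with a minimal sketch. This is not LATTE's implementation: the dictionary `ANALYTES`, the phrase table `CONCLUSIVE_PHRASES`, the normal ranges, and the function `normalize` are hypothetical, the example uses English text rather than Chinese, and unit conversion and sampling factors are ignored.

```python
import re
from typing import List

# Hypothetical toy knowledge base (illustrative values, not LATTE's dictionary).
ANALYTES = {
    "hemoglobin": {"specimen": "blood", "normal": (120.0, 160.0)},  # g/L
    "potassium": {"specimen": "serum", "normal": (3.5, 5.3)},       # mmol/L
}

# Hypothetical mapping from conclusive phrases to an abnormality label.
CONCLUSIVE_PHRASES = {
    "normal": "normal",
    "elevated": "high", "increased": "high", "high": "high",
    "decreased": "low", "reduced": "low", "low": "low",
}


def normalize(text: str) -> List[str]:
    """Return 'specimen-analyte-abnormality' strings found in free text."""
    results = []
    # Split into clauses on punctuation, keeping decimal points intact.
    for clause in re.split(r"[,;，；。]|\.(?!\d)", text.lower()):
        for name, info in ANALYTES.items():
            idx = clause.find(name)  # dictionary-based analyte detection
            if idx < 0:
                continue
            rest = clause[idx + len(name):]
            abnormality = None
            number = re.search(r"\d+(?:\.\d+)?", rest)
            if number:
                # Numeric form: compare the measured value with the normal range.
                value = float(number.group())
                low, high = info["normal"]
                if value < low:
                    abnormality = "low"
                elif value > high:
                    abnormality = "high"
                else:
                    abnormality = "normal"
            else:
                # Textual form: use the first recognized conclusive phrase.
                for word in re.findall(r"[a-z]+", rest):
                    if word in CONCLUSIVE_PHRASES:
                        abnormality = CONCLUSIVE_PHRASES[word]
                        break
            if abnormality is not None:
                results.append(f"{info['specimen']}-{name}-{abnormality}")
    return results


print(normalize("Hemoglobin 95 g/L, potassium 4.1 mmol/L."))
print(normalize("Hemoglobin was decreased"))
```

In this sketch the numeric mention "Hemoglobin 95 g/L" and the textual mention "Hemoglobin was decreased" both map to the same string, which is the property the abstract describes for the normalized form.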
- Subjects :
- China
Computer science
Knowledge Bases
Health Informatics
Machine Learning
Database normalization
03 medical and health sciences
Knowledge-based systems
0302 clinical medicine
Electronic Health Records
Humans
030212 general & internal medicine
Cluster analysis
030304 developmental biology
0303 health sciences
Clinical Laboratory Techniques
Expression (mathematics)
Computer Science Applications
Test (assessment)
Laboratory Test Result
Test set
Artificial intelligence
F1 score
Natural language processing
Details
- ISSN :
- 15320464
- Volume :
- 102
- Database :
- OpenAIRE
- Journal :
- Journal of Biomedical Informatics
- Accession number :
- edsair.doi.dedup.....a6f87c178f0a46cdf60f30bed907c725
- Full Text :
- https://doi.org/10.1016/j.jbi.2019.103372