LATTE: A knowledge-based method to normalize various expressions of laboratory test results in free text of Chinese electronic health records

Authors :
Chunyan Wu
Luming Chen
Kun Jiang
Taijiao Jiang
Yongyou Wu
Longfei Mao
Tao Yang
Lizong Deng
Source :
Journal of Biomedical Informatics. 102:103372
Publication Year :
2020
Publisher :
Elsevier BV, 2020.

Abstract

Background: A wealth of clinical information is buried in free text of electronic health records (EHR), and converting clinical information to machine-understandable form is crucial for the secondary use of EHRs. Laboratory test results, as one of the most important types of clinical information, are written in various styles in free text of EHRs. This has brought great difficulties for data integration and utilization of EHRs. Therefore, developing technology to normalize different expressions of laboratory test results in free text is indispensable for the secondary use of EHRs.

Methods: In this study, we developed a knowledge-based method named LATTE (transforming lab test results), which could transform various expressions of laboratory test results into a normalized and machine-understandable format. We first identified the analyte of a laboratory test result with a dictionary-based method and then designed a series of rules to detect information associated with the analyte, including its specimen, measured value, unit of measure, conclusive phrase and sampling factor. We determined whether a test result is normal or abnormal by understanding the meaning of conclusive phrases or by comparing its measured value with an appropriate normal range. Finally, we converted various expressions of laboratory test results, either in numeric or textual form, into a normalized form as “specimen-analyte-abnormality”. With this method, a laboratory test with the same type of abnormality would have the same representation, regardless of the way that it is mentioned in free text.

Results: LATTE was developed and optimized on a training set including 8894 laboratory test results from 756 EHRs, and evaluated on a test set including 3740 laboratory test results from 210 EHRs. Compared to experts’ annotations, LATTE achieved a precision of 0.936, a recall of 0.897 and an F1 score of 0.916 on the training set, and a precision of 0.892, a recall of 0.843 and an F1 score of 0.867 on the test set. For 223 laboratory tests with at least two different expression forms in the test set, LATTE transformed 85.7% (2870/3350) of laboratory test results into a normalized form. Besides, LATTE achieved F1 scores above 0.8 for EHRs from 18 of 21 different hospital departments, indicating its generalization capabilities in normalizing laboratory test results.

Conclusion: In conclusion, LATTE is an effective method for normalizing various expressions of laboratory test results in free text of EHRs. LATTE will facilitate EHR-based applications such as cohort querying, patient clustering and machine learning.

Availability: LATTE is freely available for download on GitHub (https://github.com/denglizong/LATTE).
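
The pipeline outlined in the Methods (dictionary lookup of the analyte, rules to find a measured value or conclusive phrase, comparison against a normal range, and output as "specimen-analyte-abnormality") might be sketched roughly as follows. This is an illustrative toy in Python over English text, not the LATTE implementation: the dictionary entries, normal ranges, phrase list, and the names `ANALYTES`, `PHRASES`, `classify` and `normalize` are all assumptions made for this example, and unit conversion and sampling factors are left out.

```python
import re
from typing import List, Optional

# Hypothetical toy dictionary: surface form -> (specimen, analyte, (low, high)).
# The ranges are illustrative only, not clinical reference values.
ANALYTES = {
    "wbc": ("blood", "white blood cell count", (4.0, 10.0)),
    "white blood cell count": ("blood", "white blood cell count", (4.0, 10.0)),
    "hemoglobin": ("blood", "hemoglobin", (120.0, 160.0)),
}

# Hypothetical conclusive phrases mapped to an abnormality label.
PHRASES = {
    "elevated": "high",
    "increased": "high",
    "decreased": "low",
    "reduced": "low",
    "normal": "normal",
}


def classify(rest: str, low: float, high: float) -> Optional[str]:
    """Decide abnormality from the text that follows an analyte mention."""
    # Rule 1: a measured value is compared with the normal range.
    number = re.search(r"\d+(?:\.\d+)?", rest)
    if number:
        value = float(number.group())
        if value > high:
            return "high"
        if value < low:
            return "low"
        return "normal"
    # Rule 2: otherwise fall back to a conclusive phrase.
    for phrase, label in PHRASES.items():
        if re.search(rf"\b{re.escape(phrase)}\b", rest):
            return label
    return None


def normalize(text: str) -> List[str]:
    """Return 'specimen-analyte-abnormality' strings found in free text."""
    results = []
    # Split into clauses at commas, semicolons, and periods that are
    # not decimal points.
    for clause in re.split(r"[,;]|\.(?!\d)", text.lower()):
        for surface, (specimen, analyte, (low, high)) in ANALYTES.items():
            match = re.search(rf"\b{re.escape(surface)}\b", clause)
            if not match:
                continue
            label = classify(clause[match.end():], low, high)
            if label:
                results.append(f"{specimen}-{analyte}-{label}")
            break  # simplification: at most one analyte per clause
    return results


if __name__ == "__main__":
    print(normalize("WBC 12.5 x10^9/L, hemoglobin decreased."))
    print(normalize("White blood cell count elevated"))
```

In this sketch a numeric mention ("WBC 12.5 x10^9/L") and a textual one ("White blood cell count elevated") both map to the same string, which mirrors the abstract's point that the same abnormality should have one representation regardless of how it is written.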

Details

ISSN :
1532-0464
Volume :
102
Database :
OpenAIRE
Journal :
Journal of Biomedical Informatics
Accession number :
edsair.doi.dedup.....a6f87c178f0a46cdf60f30bed907c725
Full Text :
https://doi.org/10.1016/j.jbi.2019.103372