
Deep learning-based natural language processing for detecting medical symptoms and histories in emergency patient triage.

Authors :
Lee, Siryeol
Lee, Juncheol
Park, Juntae
Park, Jiwoo
Kim, Dohoon
Lee, Joohyun
Oh, Jaehoon
Source :
American Journal of Emergency Medicine; Mar 2024, Vol. 77, p29-38, 10p
Publication Year :
2024

Abstract

The manual recording of electronic health records (EHRs) by clinicians in the emergency department (ED) is time-consuming and challenging. In light of recent advances in large language models (LLMs) such as GPT and BERT, this study aimed to design and validate LLMs for automatic clinical diagnosis. The models were designed to identify 12 medical symptoms and 2 patient histories from simulated clinician–patient conversations covering 6 primary symptom scenarios in emergency triage rooms.

We developed classification models by fine-tuning BERT, a transformer-based pre-trained model, and subsequently analyzed these models using eXplainable artificial intelligence (XAI) with the Shapley additive explanations (SHAP) method. A Turing test was conducted to ascertain the reliability of the XAI results by comparing them to the outcomes of the same tasks performed and explained by medical workers. An emergency medicine specialist assessed the results of both the XAI and the medical workers.

We fine-tuned four pre-trained LLMs and compared their classification performance. The KLUE-RoBERTa-based model demonstrated the highest performance (F1-score: 0.965, AUROC: 0.893) on human-transcribed script data. The XAI results using SHAP showed an average Jaccard similarity of 0.722 when compared with the explanations of medical workers for 15 samples. The Turing test revealed a small 6% gap, with the XAI and the medical workers receiving mean scores of 3.327 and 3.52, respectively.

This paper highlights the potential of LLMs for automatic EHR recording in Korean EDs. The KLUE-RoBERTa-based model demonstrated superior classification performance, and XAI using SHAP provided reliable explanations for model outputs, a reliability further confirmed by the Turing test.

• The data were collected from simulated clinician–patient conversations.
• The fine-tuned large language model identifies medical information included in electronic health records.
• The outcomes of the model were interpreted through eXplainable AI.
• A Turing test was conducted to demonstrate the reliability of the eXplainable AI results.

[ABSTRACT FROM AUTHOR]
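The abstract describes fine-tuning a KLUE-RoBERTa encoder as a multi-label classifier over 12 symptoms and 2 patient histories, and measuring agreement between SHAP highlights and medical workers' explanations with Jaccard similarity. The sketch below illustrates how such a setup might look with the Hugging Face Transformers library; the checkpoint name (klue/roberta-base), the label names, the example utterance, and the hyperparameters are illustrative assumptions rather than details taken from the paper.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# 12 symptom labels + 2 history labels (names here are placeholders).
LABELS = [f"symptom_{i+1}" for i in range(12)] + ["history_1", "history_2"]

tokenizer = AutoTokenizer.from_pretrained("klue/roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "klue/roberta-base",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # sigmoid output, BCE loss per label
)

# Toy training example: one transcribed utterance with two positive labels.
texts = ["Patient reports chest pain and shortness of breath starting two hours ago."]
targets = torch.zeros(1, len(LABELS))
targets[0, 0] = 1.0  # hypothetical index of the "chest pain" label
targets[0, 1] = 1.0  # hypothetical index of the "dyspnea" label

enc = tokenizer(texts, padding=True, truncation=True, max_length=256, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
loss = model(**enc, labels=targets).loss  # BCEWithLogitsLoss under this problem_type
loss.backward()
optimizer.step()

model.eval()
with torch.no_grad():
    probs = torch.sigmoid(model(**enc).logits)[0]  # per-label probabilities
predicted = [label for label, p in zip(LABELS, probs.tolist()) if p > 0.5]
print(predicted)

def jaccard(a, b):
    """Jaccard similarity between two sets of highlighted tokens."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

# Toy comparison of SHAP-highlighted tokens vs. a medical worker's highlights.
print(jaccard({"chest", "pain", "breath"}, {"chest", "pain", "hours"}))  # -> 0.5

The paper compares four pre-trained encoders fine-tuned in this general fashion, with the KLUE-RoBERTa variant performing best; the sketch only shows the fine-tuning and comparison pattern, not the authors' exact data or configuration.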

Details

Language :
English
ISSN :
0735-6757
Volume :
77
Database :
Supplemental Index
Journal :
American Journal of Emergency Medicine
Publication Type :
Academic Journal
Accession number :
175392112
Full Text :
https://doi.org/10.1016/j.ajem.2023.11.063