A unified framework of medical information annotation and extraction for Chinese clinical text.

Authors :
Zhu E
Sheng Q
Yang H
Liu Y
Cai T
Li J
Source :
Artificial intelligence in medicine [Artif Intell Med] 2023 Aug; Vol. 142, pp. 102573. Date of Electronic Publication: 2023 May 19.
Publication Year :
2023

Abstract

Medical information extraction consists of a group of natural language processing (NLP) tasks that together convert clinical text into pre-defined structured formats. This is a critical step in exploiting electronic medical records (EMRs). Given recent advances in NLP technologies, model implementation and performance no longer seem to be obstacles; the bottleneck instead lies in obtaining a high-quality annotated corpus and in the overall engineering workflow. This study presents an engineering framework consisting of three tasks: medical entity recognition, relation extraction, and attribute extraction. Within this framework, the whole workflow is demonstrated, from EMR data collection through model performance evaluation. Our annotation scheme is designed to be comprehensive and compatible across the multiple tasks. With EMRs from a general hospital in Ningbo, China, and manual annotation by experienced physicians, our corpus is of large scale and high quality. Built upon this Chinese clinical corpus, the medical information extraction system shows performance that approaches human annotation. The annotation scheme, (a subset of) the annotated corpus, and the code are all publicly released to facilitate further research.

Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

(Copyright © 2023 Elsevier B.V. All rights reserved.)

Details

Language :
English
ISSN :
1873-2860
Volume :
142
Database :
MEDLINE
Journal :
Artificial intelligence in medicine
Publication Type :
Academic Journal
Accession number :
37316096
Full Text :
https://doi.org/10.1016/j.artmed.2023.102573