Chinese Named Entity Recognition Method for Domain-Specific Text

Authors :
He Liu
Yuekun Ma
Chang Gao
Jia Qi
Dezheng Zhang
Source :
Tehnički Vjesnik, Vol 30, Iss 6, Pp 1799-1808 (2023)
Publication Year :
2023
Publisher :
Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek, 2023.

Abstract

Chinese named entity recognition (NER) is a critical task in natural language processing that aims to identify and classify named entities in text. However, because domain texts are highly specific and large-scale labelled datasets are lacking, NER methods trained on public-domain corpora perform poorly on domain texts. This paper proposes a named entity recognition method that incorporates sentence semantic information: an attention mechanism and a gating mechanism adaptively fuse sentence semantic information into character semantic information, enhancing entity feature representation while attenuating the noise generated by irrelevant character information. To address the lack of large-scale labelled samples, we use data self-augmentation methods to expand the training samples. Because the low-quality samples generated by self-augmentation can harm the model, we also introduce a Weighted Strategy. Experiments on the traditional Chinese medicine (TCM) prescriptions corpus show that our method achieves higher F1 values than the comparison methods.
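
The abstract does not give the model's equations, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes a dot-product attention score between each character vector and a single sentence vector, a scalar sigmoid gate per character, and a fixed down-weighting factor for augmented samples. The names `fuse`, `weighted_loss`, `gate_w_char`, `gate_w_sent`, and `aug_weight` are hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse(char_vecs, sent_vec, gate_w_char, gate_w_sent, gate_bias=0.0):
    """Blend a sentence vector into each character vector (illustrative only).

    Attention (assumed dot-product): relevance of the sentence vector
    to each character. Gate (assumed scalar sigmoid): share of the
    original character vector that is kept.
    """
    attn = softmax([dot(h, sent_vec) for h in char_vecs])
    fused = []
    for h, a in zip(char_vecs, attn):
        g = sigmoid(dot(gate_w_char, h) + dot(gate_w_sent, sent_vec) + gate_bias)
        # g keeps character information; (1 - g) admits attended sentence information.
        fused.append([g * hi + (1.0 - g) * a * si for hi, si in zip(h, sent_vec)])
    return attn, fused


def weighted_loss(losses, is_augmented, aug_weight=0.5):
    """Weighted mean of per-sample losses; augmented samples count less.

    aug_weight is an assumed constant; the paper's Weighted Strategy
    may assign weights differently.
    """
    weights = [aug_weight if aug else 1.0 for aug in is_augmented]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


if __name__ == "__main__":
    chars = [[1.0, 0.0], [0.0, 1.0]]   # two toy character vectors
    sentence = [1.0, 0.0]              # toy sentence vector
    attn, fused = fuse(chars, sentence, [0.0, 0.0], [0.0, 0.0])
    print(attn, fused)
    print(weighted_loss([1.0, 3.0], [False, True]))
```

In this sketch the gate limits how much sentence-level information can overwrite a character's own representation, and the loss weighting reduces the influence of augmented samples without discarding them. A trained model would learn the gate parameters rather than fix them by hand.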

Details

Language :
English
ISSN :
1330-3651 and 1848-6339
Volume :
30
Issue :
6
Database :
Directory of Open Access Journals
Journal :
Tehnički Vjesnik
Publication Type :
Academic Journal
Accession number :
edsdoj.fd5c675e5056442c8bd74b3e9307e0bc
Document Type :
article
Full Text :
https://doi.org/10.17559/TV-20230324000477