Chinese Named Entity Recognition Method for Domain-Specific Text
- Source :
- Tehnički Vjesnik, Vol 30, Iss 6, Pp 1799-1808 (2023)
- Publication Year :
- 2023
- Publisher :
- Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek, 2023.
Abstract
- Chinese named entity recognition (NER) is a critical task in natural language processing that aims to identify and classify named entities in text. However, because domain texts are highly specific and large-scale labelled datasets are lacking, NER methods trained on public domain corpora perform poorly on domain texts. This paper proposes a named entity recognition method that adaptively incorporates sentence semantic information into character semantic information through an attention mechanism and a gating mechanism, enhancing entity feature representation while attenuating the noise generated by irrelevant character information. In addition, to address the lack of large-scale labelled samples, we used data self-augmentation methods to expand the training samples. Furthermore, because the low-quality samples generated by the data self-augmentation process can have a negative impact on the model, we introduced a weighted strategy. Experiments on the TCM prescriptions corpus showed that our method achieved higher F1 values than the comparison methods.
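The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea it describes: an attention step pools character vectors into a sentence vector, a gate mixes that sentence vector back into each character vector, and augmented samples receive a smaller loss weight. All names (`fuse_sentence_into_chars`, `W_g`, `b_g`, `weighted_loss`, `aug_weight`), the mean-pooled attention query, and the fixed weight of 0.5 are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fuse_sentence_into_chars(chars, W_g, b_g):
    """Fuse a sentence vector into character vectors (illustrative only).

    chars: (T, d) character representations for one sentence.
    W_g:   (2d, d) hypothetical gate weights; b_g: (d,) gate bias.
    Returns the fused (T, d) vectors and the (T,) attention weights.
    """
    T, d = chars.shape
    # Attention: score each character against a mean-pooled query (an assumption).
    query = chars.mean(axis=0)
    alpha = softmax(chars @ query / np.sqrt(d))
    sentence = alpha @ chars  # (d,) attention-weighted sentence vector
    # Gate: decide per dimension how much sentence information to admit.
    tiled = np.broadcast_to(sentence, chars.shape)
    gate = sigmoid(np.concatenate([chars, tiled], axis=1) @ W_g + b_g)
    fused = gate * chars + (1.0 - gate) * tiled
    return fused, alpha


def weighted_loss(per_sample_loss, is_augmented, aug_weight=0.5):
    """Weighted mean loss that down-weights augmented samples (assumed scheme)."""
    weights = np.where(is_augmented, aug_weight, 1.0)
    return float((weights * per_sample_loss).sum() / weights.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    chars = rng.normal(size=(5, 8))    # 5 characters, 8-dimensional vectors
    W_g = rng.normal(size=(16, 8))
    fused, alpha = fuse_sentence_into_chars(chars, W_g, np.zeros(8))
    print(fused.shape, alpha.round(3))
    print(weighted_loss(np.array([1.0, 3.0]), np.array([False, True])))
```

Because the gate lies between 0 and 1, each fused value is a convex combination of the character value and the sentence value, which is one simple way to let a model suppress sentence information where it would add noise.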
Details
- Language :
- English
- ISSN :
- 1330-3651 and 1848-6339
- Volume :
- 30
- Issue :
- 6
- Database :
- Directory of Open Access Journals
- Journal :
- Tehnički Vjesnik
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.fd5c675e5056442c8bd74b3e9307e0bc
- Document Type :
- article
- Full Text :
- https://doi.org/10.17559/TV-20230324000477