1. An approach to named entity recognition towards micro-blog
- Author
- Li Gang and Huang Yongfeng
- Subjects
- named entity recognition, micro-blog, conditional random field, word representation, active learning, Electronics, TK7800-8360
- Abstract
- Named entity recognition (NER) is a fundamental technology in natural language processing (NLP). In recent years, the rapid development of social network platforms such as microblogs has presented new challenges to traditional NER technology because of the unique form of microblog text. In this paper, an improved method based on the conditional random field (CRF) model is proposed for microblog texts. Because the texts are short and semantically ambiguous, external data resources are introduced to generate a topic feature and a word representation feature for training the model. Because microblog data is large in scale and manual annotation is costly, an active learning algorithm based on least confidence is adopted to improve training at a lower labor cost. Experiments on a Sina Weibo data set show that this method improves the F-score by 4.54% compared to traditional CRF methods.
- Published
- 2018
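
As an illustration of the least-confidence selection named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a trained model can report the probability of its most likely label sequence for each unlabeled post. The function name `least_confidence_select`, the `best_path_prob` callback, and the toy probabilities are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def least_confidence_select(
    pool: Sequence[str],
    best_path_prob: Callable[[str], float],
    batch_size: int,
) -> List[Tuple[str, float]]:
    """Pick the posts whose most likely labeling is least probable.

    best_path_prob is a hypothetical callback standing in for a trained
    sequence model: it returns P(best label sequence | post).
    """
    # Least-confidence uncertainty: 1 - P(best labeling | post).
    scored = [(text, 1.0 - best_path_prob(text)) for text in pool]
    # Most uncertain first; sort() is stable, so ties keep pool order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:batch_size]


# Toy probabilities (made up for illustration, not from the paper).
toy_probs = {"post a": 0.9, "post b": 0.25, "post c": 0.5}
selected = least_confidence_select(list(toy_probs), toy_probs.__getitem__, batch_size=2)
print([text for text, _ in selected])  # ['post b', 'post c']
```

In an active learning loop, the selected posts would be sent for manual annotation, added to the training set, and the model retrained before the next selection round.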