
A Two-Level Place Names Identification Based on the N-Shortest Path and CRFs

Authors :
Xin-fu Li
Mei Zheng
Jian-feng Shi
Source :
2009 International Conference on Information Management, Innovation Management and Industrial Engineering.
Publication Year :
2009
Publisher :
IEEE, 2009.

Abstract

This paper presents a two-level method for identifying Chinese place names, based on the N-shortest path algorithm and Conditional Random Fields (CRFs), that aims to solve the problem of low recall in Chinese place name identification. At the low level, a rough segmentation method based on the N-shortest path is used to improve recall. The rough segmentation result is then passed to the high level as one of the features for high-level place name identification. The high-level CRF model tags the text using the feature supplied by the low level together with single and complex features of place name words. Adding the complex features helps to mine context information and improves the accuracy of place name identification, and the tagging result can then be combined with rules to produce the final place names. This two-level model ensures a high recall rate while improving accuracy. In the experiments, the established People's Daily corpus from January 1998, which contains 3128 distinct place names, is used as training data, and articles drawn at random from the 2003 People's Daily are used for testing. The experimental results show a high recall rate and indicate that the method is practical and effective.
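
The low-level rough segmentation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `n_shortest_segmentations`, the toy lexicon, the unit edge cost, and the `max_len` limit are all assumptions made for illustration. The sketch treats character positions as nodes of a word lattice, treats every lexicon word (or single character) as an edge of cost 1, and enumerates complete paths in order of increasing cost, so the first N paths returned are the N segmentations with the fewest words.

```python
import heapq


def n_shortest_segmentations(text, lexicon, n=2, max_len=4):
    """Return up to n segmentations of `text` that use the fewest words.

    Hypothetical helper: nodes are character positions 0..len(text), and
    each lexicon word (or any single character) spanning i..j is an edge
    of cost 1. Paths are expanded best-first, so complete paths are
    popped from the heap in order of non-decreasing cost.
    """
    heap = [(0, 0, [])]  # entries: (cost so far, position, words so far)
    results = []
    while heap and len(results) < n:
        cost, pos, words = heapq.heappop(heap)
        if pos == len(text):
            results.append(words)  # a complete path through the lattice
            continue
        for end in range(pos + 1, min(len(text), pos + max_len) + 1):
            piece = text[pos:end]
            # Single characters are always allowed so the lattice stays connected.
            if end == pos + 1 or piece in lexicon:
                heapq.heappush(heap, (cost + 1, end, words + [piece]))
    return results


# Toy example with an assumed two-entry lexicon.
candidates = n_shortest_segmentations("我去北京市", {"北京", "北京市"}, n=2)
```

Keeping several candidate segmentations instead of only the single best one is what allows a place name missed by the top path to survive into the next stage, which matches the recall-oriented role the abstract assigns to the low level. This exhaustive best-first enumeration is only practical for short inputs; a production system would prune per-node candidates.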

Details

Database :
OpenAIRE
Journal :
2009 International Conference on Information Management, Innovation Management and Industrial Engineering
Accession number :
edsair.doi...........4474d159f41ff2fa05f90b74b005fa5f
Full Text :
https://doi.org/10.1109/iciii.2009.426