Domain-Specific Chinese Word Segmentation with Document-Level Optimization

Authors :
Qian Yan
Shoushan Li
Zekai Du
Chenlin Shen
Fen Xia
Source :
Natural Language Processing and Chinese Computing (NLPCC), ISBN: 9783319736174
Publication Year :
2018
Publisher :
Springer International Publishing, 2018.

Abstract

Previous studies typically formulate Chinese word segmentation as a character sequence labeling task and optimize the solution at the sentence level. In this paper, we address Chinese word segmentation as a document-level optimization problem. First, we apply a state-of-the-art approach, i.e., long short-term memory (LSTM), to perform character classification; then, we propose a global objective function on the basis of character classification and achieve global optimization via Integer Linear Programming (ILP). Specifically, we propose several kinds of global constraints in ILP to capture various segmentation knowledge: segmentation consistency and domain-specific regulations, which achieve document-level optimization, in addition to label transition knowledge, which achieves sentence-level optimization. Empirical studies demonstrate the effectiveness of the proposed approach to domain-specific Chinese word segmentation.
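
The idea of combining per-character classification scores with sentence-level and document-level constraints can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the B/M/E/S label scheme, the hand-written probabilities standing in for LSTM outputs, the single consistency rule (every occurrence of a given term receives the same labels), and the exhaustive search used in place of an ILP solver. On a toy input, exhaustive search finds the same optimum a solver would find for the same objective and constraints.

```python
import itertools
import math

LABELS = "BMES"
# Legal label transitions for the B/M/E/S scheme (sentence-level knowledge).
ALLOWED = {"B": "ME", "M": "ME", "E": "BS", "S": "BS"}


def valid_sequences(n):
    """Yield every length-n label string that forms a legal segmentation."""
    for seq in itertools.product(LABELS, repeat=n):
        if seq[0] in "BS" and seq[-1] in "ES" and all(
            b in ALLOWED[a] for a, b in zip(seq, seq[1:])
        ):
            yield "".join(seq)


def score(seq, probs):
    """Sum of log-probabilities of the chosen label at each character."""
    return sum(math.log(p[label]) for label, p in zip(seq, probs))


def segment_document(sentences, probs, term):
    """Label all sentences jointly; each occurrence of `term` must share labels."""
    best, best_score = None, -math.inf
    candidates = [list(valid_sequences(len(s))) for s in sentences]
    for combo in itertools.product(*candidates):
        # Collect the label substring assigned to each occurrence of the term.
        patterns = set()
        for sent, seq in zip(sentences, combo):
            start = sent.find(term)
            while start != -1:
                patterns.add(seq[start:start + len(term)])
                start = sent.find(term, start + 1)
        if len(patterns) > 1:  # document-level consistency violated
            continue
        total = sum(score(seq, p) for seq, p in zip(combo, probs))
        if total > best_score:
            best, best_score = combo, total
    return list(best)


def to_words(sentence, labels):
    """Cut a sentence into words: a word ends at every E or S label."""
    words, current = [], ""
    for ch, lab in zip(sentence, labels):
        current += ch
        if lab in "ES":
            words.append(current)
            current = ""
    return words


def p(b, m, e, s):
    """Build one character's label distribution (hypothetical classifier output)."""
    return {"B": b, "M": m, "E": e, "S": s}


# Toy document: the classifier is confident in sentence 1 but, taken alone,
# would wrongly prefer S-B-E for sentence 2.
sentences = ["量血压", "血压高"]
probs = [
    [p(0.03, 0.03, 0.03, 0.91), p(0.91, 0.03, 0.03, 0.03), p(0.03, 0.03, 0.91, 0.03)],
    [p(0.35, 0.05, 0.05, 0.55), p(0.50, 0.05, 0.40, 0.05), p(0.05, 0.05, 0.50, 0.40)],
]

result = segment_document(sentences, probs, term="血压")
words = [to_words(s, labels) for s, labels in zip(sentences, result)]
```

In this toy case the consistency rule lets the confident analysis of "血压" in the first sentence override the weaker local preference in the second. A practical system would replace the enumeration with an ILP solver, since the number of label combinations grows exponentially with document length.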

Details

ISBN :
978-3-319-73617-4
Database :
OpenAIRE
Accession number :
edsair.doi...........964e8e753f2e8224170fa2aa15537664
Full Text :
https://doi.org/10.1007/978-3-319-73618-1_30