Domain-Specific Chinese Word Segmentation with Document-Level Optimization
- Source :
- Natural Language Processing and Chinese Computing (NLPCC), ISBN: 9783319736174
- Publication Year :
- 2018
- Publisher :
- Springer International Publishing, 2018.
Abstract
- Previous studies normally formulate Chinese word segmentation as a character sequence labeling task and optimize the solution at the sentence level. In this paper, we address Chinese word segmentation as a document-level optimization problem. First, we apply a state-of-the-art approach, i.e., long short-term memory (LSTM), to perform character classification; then, we propose a global objective function on the basis of character classification and achieve global optimization via Integer Linear Programming (ILP). Specifically, we propose several kinds of global constraints in ILP to capture various segmentation knowledge, such as segmentation consistency and domain-specific regulations, to achieve document-level optimization, in addition to label transition knowledge to achieve sentence-level optimization. Empirical studies demonstrate the effectiveness of the proposed approach to domain-specific Chinese word segmentation.
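The idea in the abstract, per-character label scores combined with sentence-level transition constraints and a document-level consistency constraint, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the BMES tag set, the hand-written scores standing in for LSTM outputs, the sum-of-scores objective, the names `decode`, `DOC_PROBS` and `OCCURRENCES`, and the use of exhaustive enumeration in place of an ILP solver (feasible only because the example is tiny).

```python
from itertools import product

LABELS = "BMES"  # assumed tag set: Begin, Middle, End, Single

# Allowed label transitions (sentence-level constraint).
ALLOWED = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
           ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}


def valid(seq):
    """True if seq is a well-formed BMES labeling of one sentence."""
    return (seq[0] in "BS" and seq[-1] in "ES"
            and all(pair in ALLOWED for pair in zip(seq, seq[1:])))


def score(seq, probs):
    """Sum of the per-character scores of the chosen labels."""
    return sum(p[label] for p, label in zip(probs, seq))


def consistent(labelings, occurrences):
    """Document-level constraint: all occurrences of a shared string
    (sentence index, start, length) must receive identical labels."""
    spans = {labelings[s][i:i + n] for s, i, n in occurrences}
    return len(spans) <= 1


def decode(doc_probs, occurrences):
    """Pick the highest-scoring joint labeling of all sentences that
    satisfies both constraint types (brute force, not a real ILP solver)."""
    candidates = [[seq for seq in product(LABELS, repeat=len(probs)) if valid(seq)]
                  for probs in doc_probs]
    feasible = (combo for combo in product(*candidates)
                if consistent(combo, occurrences))
    best = max(feasible,
               key=lambda combo: sum(score(s, p) for s, p in zip(combo, doc_probs)))
    return ["".join(seq) for seq in best]


def p(B=0.0, M=0.0, E=0.0, S=0.0):
    """Hypothetical classifier output for one character."""
    return {"B": B, "M": M, "E": E, "S": S}


# Two 3-character sentences; characters 0-1 of sentence 0 and
# characters 1-2 of sentence 1 are assumed to be the same string.
DOC_PROBS = [
    [p(B=0.9, S=0.1), p(E=0.9, S=0.1), p(S=0.9, B=0.1)],
    [p(S=0.9, B=0.1), p(B=0.4, S=0.6), p(E=0.4, S=0.6)],
]
OCCURRENCES = [(0, 0, 2), (1, 1, 2)]

print(decode(DOC_PROBS, []))           # sentence-level only: ['BES', 'SSS']
print(decode(DOC_PROBS, OCCURRENCES))  # with consistency:    ['BES', 'SBE']
```

In this toy document the second sentence, decoded on its own, splits the shared string into two single-character words; the consistency constraint lets the confident segmentation in the first sentence override that choice. A real system would express the same objective and constraints as binary variables in an ILP solver rather than enumerating labelings.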
- Subjects :
- Optimization problem
Computer science
Machine learning
Sequence labeling
Domain (software engineering)
Consistency (database systems)
Character (mathematics)
Segmentation
Artificial intelligence
Global optimization
Integer programming
Details
- ISBN :
- 978-3-319-73617-4
- Database :
- OpenAIRE
- Journal :
- Natural Language Processing and Chinese Computing (NLPCC), ISBN: 9783319736174
- Accession number :
- edsair.doi...........964e8e753f2e8224170fa2aa15537664
- Full Text :
- https://doi.org/10.1007/978-3-319-73618-1_30