A Semantic Representation Enhancement Method for Chinese News Headline Classification

Authors :
Chengsen Ru
Yin Zhongbo
Wei Luo
Zhunchen Luo
Jintao Tang
Xiaolei Ma
Source :
Natural Language Processing and Chinese Computing (NLPCC), ISBN: 9783319736174
Publication Year :
2018
Publisher :
Springer International Publishing, 2018.

Abstract

Short texts such as news headlines have recently attracted increasing research interest. Because short text is inherently sparse, current text classification methods perform poorly when applied to news headlines. To overcome this problem, this paper proposes a novel method that enhances the semantic representation of headlines. First, keywords extracted from the most similar news are added to expand the word features. Second, word embeddings are pre-trained on a news-domain corpus to enhance the word representation. In addition, the fastText classifier, which uses a linear method to classify text with high speed and accuracy, is adopted for news headline classification. On the Chinese news headline categorization task at NLPCC 2017, the proposed method achieved an F-measure of 83.1%, ranking first among 33 teams.
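
The keyword-expansion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how similarity is measured or how keywords are extracted, so everything here is an assumption: bag-of-words cosine similarity stands in for the similarity measure, raw term frequency stands in for the keyword extractor, and the names `cosine`, `expand_headline`, and `top_k` are hypothetical. Inputs are assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_headline(headline_tokens, news_corpus, top_k=3):
    """Append up to top_k keywords taken from the most similar news item.

    news_corpus is a list of token lists (hypothetical, pre-segmented).
    Keywords are the most frequent tokens of the best match that do not
    already occur in the headline. This is an assumed stand-in, not the
    extractor used in the paper.
    """
    headline_tokens = list(headline_tokens)
    if not news_corpus:
        return headline_tokens
    query = Counter(headline_tokens)
    best = max(news_corpus, key=lambda doc: cosine(query, Counter(doc)))
    # If nothing in the corpus overlaps the headline, leave it unchanged.
    if cosine(query, Counter(best)) == 0.0:
        return headline_tokens
    seen = set(headline_tokens)
    keywords = [t for t, _ in Counter(best).most_common() if t not in seen]
    return headline_tokens + keywords[:top_k]


# Toy example with English tokens; real input would be segmented Chinese text.
headline = ["apple", "launches", "product"]
corpus = [
    ["apple", "launches", "phone", "phone", "chip", "launches"],
    ["team", "wins", "match", "match"],
]
expanded = expand_headline(headline, corpus, top_k=2)
```

In a pipeline like the one the abstract outlines, the expanded token list would then be joined into a single line and passed to the classifier in place of the original headline.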

Details

ISBN :
978-3-319-73617-4
Database :
OpenAIRE
Accession number :
edsair.doi...........b0be4e25a970551df971bd9cda5dbb92