A Semantic Representation Enhancement Method for Chinese News Headline Classification
- Source :
- Natural Language Processing and Chinese Computing ISBN: 9783319736174, NLPCC
- Publication Year :
- 2018
- Publisher :
- Springer International Publishing, 2018.
Abstract
- Short texts such as news headlines have attracted growing research interest. Because short text is inherently sparse, current text classification methods perform poorly on news headlines. To overcome this problem, this paper proposes a method that enhances the semantic representation of headlines. First, keywords extracted from the most similar news are added to expand the word features. Second, word embeddings are pre-trained on a news-domain corpus to enhance the word representation. Finally, the fastText classifier, which uses a linear method to classify text quickly and accurately, is adopted for news headline classification. On the NLPCC 2017 Chinese news headline categorization task, the proposed method achieved an F-measure of 83.1%, ranking first among 33 teams.
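The keyword-expansion step in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes headlines and news articles are already segmented into tokens, uses bag-of-words cosine similarity to pick the most similar news item, and takes the most frequent unseen tokens as keywords. The abstract does not specify the similarity measure or the keyword extractor, so both choices, along with the names `cosine` and `expand_headline`, are hypothetical.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_headline(headline_tokens, news_corpus, top_k=3):
    """Append up to top_k keywords taken from the most similar news item.

    Assumption: keywords are the most frequent tokens of that item
    that do not already occur in the headline.
    """
    headline_bag = Counter(headline_tokens)
    if not news_corpus:
        return list(headline_tokens)
    # Pick the news item whose bag of words is closest to the headline.
    best = max(news_corpus, key=lambda doc: cosine(headline_bag, Counter(doc)))
    # Count only tokens that would add new information to the headline.
    candidates = Counter(t for t in best if t not in headline_bag)
    keywords = [t for t, _ in candidates.most_common(top_k)]
    return list(headline_tokens) + keywords
```

The expanded token list would then be passed to the classifier in place of the bare headline. In the setting the abstract describes, that classifier is fastText initialized with embeddings pre-trained on news-domain text.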
- Subjects :
- Word embedding
Computer science
Rank (computer programming)
Networking & telecommunications
Headline
Engineering and technology
Task (project management)
Domain (software engineering)
Categorization
Classifier (linguistics)
Electrical engineering, electronic engineering, information engineering
Artificial intelligence & image processing
Artificial intelligence
Word (computer architecture)
Natural language processing
Details
- ISBN :
- 978-3-319-73617-4
- ISBNs :
- 9783319736174
- Database :
- OpenAIRE
- Journal :
- Natural Language Processing and Chinese Computing ISBN: 9783319736174, NLPCC
- Accession number :
- edsair.doi...........b0be4e25a970551df971bd9cda5dbb92