
Topic-based automatic summarization algorithm for Chinese short text.

Authors :
Ma TH
Wang HM
Zhao YW
Tian Y
Al-Nabhan N
Source :
Mathematical biosciences and engineering : MBE [Math Biosci Eng] 2020 May 12; Vol. 17 (4), pp. 3582-3600.
Publication Year :
2020

Abstract

Most current automatic summarization methods target English text. In Chinese text, words vary widely, parts of speech are numerous and complex, and polysemous or ambiguous words appear frequently, so useful feature words are harder to extract from Chinese text than from English text. Because of the complex syntax of Chinese, relatively few automatic summarization methods exist for Chinese text. Earlier methods could only select the important sentences from the original text and arrange them in a simple order, producing summaries with disordered sentences and insufficient coherence. Moreover, Chinese short text usually contains more redundant information and has irregular sentence structure. We therefore propose a topic-based automatic summarization method for Chinese short text. First, a key sentence selection method combines topic words with TF-IDF to score each text in the original data against its corresponding topic. The sentence with the highest score is then selected as the topic sentence for that topic. Because Weibo short text may contain a lot of irrelevant information and sometimes lacks important components of the topic, three retouching mechanisms are proposed to improve the conciseness, richness, and readability of the extracted topic sentences. We validate our approach on natural disaster and social hot event datasets from Sina Weibo. The experimental results show that the polished topic summary not only reflects the exact relationship between topic sentences and natural disasters or social hot events, but also carries rich semantic information. More importantly, the basic elements of a natural disaster or social hot event can almost always be grasped from the topic sentence, which can help the government guide disaster relief and meet users' need to obtain information about social hot events quickly.
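
The key sentence selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formula, so the sketch assumes that a text's score for a topic is the sum of the TF-IDF weights of the topic words it contains, and that the texts are already segmented into tokens. The function names `tfidf_weights` and `select_topic_sentence`, and the smoothed IDF form `log(N / df) + 1`, are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: tf-idf weight} dict per tokenized text.

    Assumption: tf is the relative term frequency within a text and
    idf is log(N / df) + 1, one common smoothed variant.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per text

    weights = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc) or 1  # guard against empty texts
        weights.append({
            term: (count / length) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return weights


def select_topic_sentence(docs, topic_words):
    """Score each text by the tf-idf mass of its topic words.

    Returns (index of the highest-scoring text, list of all scores).
    Assumption: the score is a plain sum over matching topic words.
    """
    topic_words = set(topic_words)
    scores = [
        sum(w for term, w in doc_weights.items() if term in topic_words)
        for doc_weights in tfidf_weights(docs)
    ]
    best = max(range(len(docs)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Placeholder English tokens stand in for segmented Chinese words.
    posts = [
        ["earthquake", "hit", "sichuan"],
        ["weather", "nice", "today"],
        ["sichuan", "earthquake", "rescue", "earthquake"],
    ]
    best, scores = select_topic_sentence(posts, {"earthquake", "sichuan", "rescue"})
    print(best, scores)
```

In this sketch the topic words are supplied by the caller; in the paper they come from the topic-based part of the method, and the selected sentence would then pass through the three retouching mechanisms, neither of which is modeled here.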

Details

Language :
English
ISSN :
1551-0018
Volume :
17
Issue :
4
Database :
MEDLINE
Journal :
Mathematical biosciences and engineering : MBE
Publication Type :
Academic Journal
Accession number :
32987545
Full Text :
https://doi.org/10.3934/mbe.2020202