Chinese and Thai Bilingual Topic Detection Online

Authors :
Rang Ziqiang
Zhou Lanjiang
Zhang Jinpeng
Xian Yantuan
Yu Zhengtao
Source :
MATEC Web of Conferences, Vol 100, p 02055 (2017)
Publication Year :
2017
Publisher :
EDP Sciences, 2017.

Abstract

Bilingual topic detection is an important application of natural language processing in the Internet Plus era and amid the trend of economic globalization. Existing bilingual topic detection methods cannot handle the inconsistent distribution of topics across the two languages. To address this shortcoming, this paper introduces a maximal-clique-based method that detects bilingual topics from Chinese and Thai feature words. First, keywords carrying the news information are extracted from each Chinese and Thai document with the TextRank algorithm. Next, the keywords are disambiguated using similarity combined with a Chinese-Thai dictionary. Then, credible association rules are used to cluster the Chinese and Thai feature words, which generates maximal cliques of bilingual topics. Finally, similar maximal cliques are clustered to obtain the final set of topics. The method can recommend bilingual topics of different sizes according to users' needs. Experiments on Chinese and Thai news texts from January 2016 gave good results. By treating the task as cross-language word clustering, the algorithm effectively addresses the inconsistency of bilingual topic distribution, requires no estimate of the number of topics, and has low time complexity, so it is suitable for online bilingual topic discovery.
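
The clustering steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes keywords have already been extracted and mapped to a shared vocabulary (the `zh:`/`th:` tokens are placeholders), it approximates a "credible association rule" by requiring co-occurrence confidence above a threshold in both directions, and it merges similar cliques by Jaccard overlap. The function names and the thresholds `min_conf` and `min_jaccard` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_graph(docs, min_conf):
    """Link two feature words when they co-occur reliably (assumed rule)."""
    doc_freq = Counter()
    pair_freq = Counter()
    for words in docs:
        doc_freq.update(words)
        pair_freq.update(combinations(sorted(words), 2))
    graph = {w: set() for w in doc_freq}
    for (a, b), n in pair_freq.items():
        # Confidence must hold in both directions: n/df(a) and n/df(b).
        if n / max(doc_freq[a], doc_freq[b]) >= min_conf:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def merge_similar(cliques, min_jaccard):
    """Greedily merge cliques whose Jaccard overlap reaches min_jaccard."""
    topics = []
    for c in sorted(cliques, key=len, reverse=True):
        for i, t in enumerate(topics):
            if len(c & t) / len(c | t) >= min_jaccard:
                topics[i] = t | c
                break
        else:
            topics.append(set(c))
    return topics


# Toy corpus: each document is the set of its (already aligned) keywords.
docs = [
    {"zh:earthquake", "th:earthquake", "zh:rescue"},
    {"zh:earthquake", "th:earthquake", "zh:rescue"},
    {"zh:election", "th:election", "th:vote"},
    {"zh:election", "th:election", "th:vote"},
    {"zh:earthquake", "th:vote"},  # noisy document mixing two topics
]
graph = build_graph(docs, min_conf=0.6)
topics = merge_similar(maximal_cliques(graph), min_jaccard=0.5)
for topic in topics:
    print(sorted(topic))
```

In this sketch the number of topics is never specified in advance; it falls out of the clique structure, which mirrors the advantage the abstract claims. Raising `min_conf` yields smaller, tighter topics, and lowering `min_jaccard` merges more cliques into larger ones.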

Details

Language :
English, French
ISSN :
2261236X
Volume :
100
Database :
Directory of Open Access Journals
Journal :
MATEC Web of Conferences
Publication Type :
Academic Journal
Accession number :
edsdoj.749c6d53e0b415f846498547b53b129
Document Type :
article
Full Text :
https://doi.org/10.1051/matecconf/201710002055