Cross-lingual sentiment classification: Similarity discovery plus training data adjustment.

Authors :: Zhang, Peng; Wang, Suge; Li, Deyu
Source :: Knowledge-Based Systems. Sep2016, Vol. 107, p129-141. 13p.
Publication Year :: 2016
Abstract: The performance of cross-lingual sentiment classification is sharply limited by the language gap, which means that each language has its own ways to express sentiments. Many methods have been designed to transmit sentiment information across languages by making use of machine translation, parallel corpora, auxiliary unlabeled samples and other resources. In this paper, a new approach is proposed based on the selection of training data, where labeled samples highly similar to the target language are put into the training set. The refined training samples are used to build up an effective cross-lingual sentiment classifier focusing on the target language. The proposed approach contains two major strategies: the aligned-translation topic model and the semi-supervised training data adjustment. The aligned-translation topic model provides a cross-language representation space in which the semi-supervised training data adjustment procedure attempts to select effective training samples to eliminate the negative influence of the semantic distribution differences between the original and target languages. The experiments show that the proposed approach is feasible for cross-language sentiment classification tasks and provides insight into the semantic relationship between two different languages. [ABSTRACT FROM AUTHOR]
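
Illustrative note: the training-data selection idea in the abstract (keeping labeled samples that are highly similar to the target language) could be sketched roughly as below. This is not the authors' algorithm. It assumes that every document has already been mapped to a vector in some shared cross-language space, for example topic proportions, and it replaces the paper's semi-supervised adjustment procedure with a simple one-shot cosine-similarity filter. The function name `select_similar_samples` and the parameter `keep_ratio` are hypothetical.

```python
import numpy as np


def select_similar_samples(source_vecs, source_labels, target_vecs, keep_ratio=0.5):
    """Keep the labeled source-language samples closest to the target data.

    Hypothetical helper: a one-shot filter, not the semi-supervised
    procedure described in the paper.
    """
    source_vecs = np.asarray(source_vecs, dtype=float)
    source_labels = np.asarray(source_labels)
    target_vecs = np.asarray(target_vecs, dtype=float)

    # Centroid of the unlabeled target-language samples in the shared space.
    centroid = target_vecs.mean(axis=0)

    # Cosine similarity between each source sample and the target centroid.
    # The small constant guards against division by zero for all-zero vectors.
    src_norms = np.linalg.norm(source_vecs, axis=1)
    sims = (source_vecs @ centroid) / (src_norms * np.linalg.norm(centroid) + 1e-12)

    # Keep the most similar fraction of the source samples (at least one).
    n_keep = max(1, int(round(keep_ratio * len(source_vecs))))
    keep_idx = np.sort(np.argsort(-sims)[:n_keep])
    return source_vecs[keep_idx], source_labels[keep_idx], keep_idx
```

The retained samples would then be used to train an ordinary sentiment classifier; how the shared space is built and how the selection is iterated are left to the paper itself.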

Subjects :: *SENTIMENT analysis; *MACHINE translating; *CROSS-language information retrieval; *SEMANTIC computing; *COMPUTER hardware description languages
