Two-Level LSTM for Sentiment Analysis With Lexicon Embedding and Polar Flipping
- Authors
Mengyang Li, Tao Yang, Ming Li, and Ou Wu
- Subjects
Computer science, Natural language processing, Sentiment analysis, Text mining, Data mining, Lexicon, Embedding, Semantics, Classifier (linguistics), Algorithms, Software, Human-Computer Interaction, Control and Systems Engineering, Computer Science Applications, Electrical and Electronic Engineering, Information Systems
- Abstract
Sentiment analysis is a key component in various text mining applications. Numerous sentiment classification techniques, including conventional and deep-learning-based methods, have been proposed in the literature. Most existing methods assume that a high-quality training set is given. Nevertheless, constructing a training set with highly accurate labels is challenging in real applications, because text samples usually contain complex sentiment representations and their annotation is subjective. We address this challenge by introducing a new labeling strategy and a two-level long short-term memory (LSTM) network for sentiment classification. Lexical cues are useful for sentiment analysis and have been exploited in conventional studies; for example, polar and negation words play important roles. A new encoding strategy, ρ-hot encoding, is proposed to alleviate the drawbacks of one-hot encoding and thus effectively incorporate useful lexical cues. Moreover, the sentiment polarity of a word may change across sentences due to label noise or context, so a flipping model is proposed to capture the polar flipping of words within a sentence. We compile three Chinese datasets on the basis of our labeling strategy and the proposed methodology. Experiments demonstrate that the proposed method outperforms state-of-the-art algorithms on both benchmark English data and our compiled Chinese data.
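The abstract names ρ-hot encoding only as a softer alternative to one-hot encoding that incorporates lexical cues; the paper's exact formulation is not given here. The sketch below is a hypothetical illustration of that general idea under one plausible reading: non-cue word dimensions receive a soft weight ρ (0 < ρ ≤ 1) instead of a hard 1, while words from assumed polar-word and negation lexicons keep full weight. The lexicons, vocabulary, and weighting rule are all assumptions for illustration, not the authors' method.

```python
import numpy as np

VOCAB = ["good", "bad", "not", "movie"]
POLAR_WORDS = {"good", "bad"}  # assumed polar-word lexicon
NEGATIONS = {"not"}            # assumed negation lexicon

def one_hot(word):
    # Standard one-hot: a hard 0/1 indicator per vocabulary entry.
    vec = np.zeros(len(VOCAB))
    vec[VOCAB.index(word)] = 1.0
    return vec

def rho_hot(word, rho=0.8):
    # Hypothetical rho-hot variant: lexical-cue words keep weight 1.0,
    # all other words are down-weighted to rho.
    vec = np.zeros(len(VOCAB))
    weight = 1.0 if word in POLAR_WORDS | NEGATIONS else rho
    vec[VOCAB.index(word)] = weight
    return vec

print(one_hot("movie"))  # "movie" dimension is 1.0
print(rho_hot("movie"))  # "movie" dimension is softened to 0.8
print(rho_hot("good"))   # polar cue "good" keeps full weight 1.0
```

Under this reading, the encoding lets downstream layers distinguish lexicon-backed cue words from ordinary words without discarding the latter entirely.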
- Published
- 2022