1. Sentiment analysis in Twitter
- Author
Xiao, Wenhan FYTGS
- Abstract
The emergence of micro-blogging services has changed the way people share information on the Web. Their rapidly growing worldwide popularity makes prominent social networking services such as Twitter a potentially large information base. People use tweets to share opinions and sentiments about what is going on around them, and sentiment analysis of these short, informal texts is attracting increasing attention. However, distinctive characteristics of tweets, such as creative spelling and punctuation and genre-specific terminology, pose new challenges that cannot be met by simply applying traditional information extraction technologies that have proven successful on Web corpora. In this thesis, we design a sentiment analysis system based on a Support Vector Machine classification model, leveraging a variety of stylistic, lexical, and syntactic features. With external resources such as Tweet-NLP and an emoticon dictionary, we propose a tweet-specific preprocessing method to handle the informal text genre of tweets. In addition, to capture contextual interactions among words for sentiment analysis, we incorporate dependency parsing with the Stanford Parser, which gives our system a competitive advantage in SemEval-2015 Task 10: Sentiment Analysis in Twitter. Our system placed sixth in the message-level task on the Twitter2015 test set, obtaining a macro-averaged F-score of 63.00. Finally, we compare the performance of the classifier under different feature combinations through ablation experiments. The results reveal that the syntactic features and the lexical features based on automatic tweet-specific sentiment lexicons are the most influential feature groups in our system, providing gains of 2-7 percentage points on the test datasets.
- Published
- 2015