1. CIDER: Context-sensitive polarity measurement for short-form text.
- Author
- Young, James C., Arthur, Rudy, and Williams, Hywel T. P.
- Subjects
- *SENTIMENT analysis, *LINGUISTIC context, *HEADLINES, *LINGUISTIC analysis, *CIDER (Alcoholic beverage), *ENCYCLOPEDIAS & dictionaries
- Abstract
Researchers commonly perform sentiment analysis on large collections of short texts, such as tweets, Reddit posts or newspaper headlines, that all focus on a specific topic, theme or event. Usually, general-purpose sentiment analysis methods are used. These perform well on average but miss the variation in meaning that occurs across different contexts. For example, the word "active" has a very different intention and valence in the phrase "active lifestyle" than in "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEmantic Reasoner), which performs context-sensitive linguistic analysis: the valence of sentiment-laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper, we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist unsupervised sentiment analysis techniques on a large collection of tweets about the weather. CIDER is also applicable to alternative (non-sentiment) linguistic scales. A case study on gender in the UK is presented, with the identification of highly gendered and sentiment-laden days. We have made our implementation of CIDER available as a Python package: https://pypi.org/project/ciderpolarity/. [ABSTRACT FROM AUTHOR]
- Published
- 2024