1. ET
- Author
- Ruchi Parikh and Kamalakar Karlapalem
- Subjects
- Scheme (programming language), Information retrieval, Event (computing), Computer science, Hierarchical clustering, Set (abstract data type), Text mining, Key (cryptography), Domain knowledge, Relevance (information retrieval), Data mining
- Abstract
Social media sites such as Twitter and Facebook have emerged as popular tools for people to express their opinions on various topics. The large amount of data these media provide is extremely valuable for mining trending topics and events. In this paper, we build an efficient, scalable system to detect events from tweets (ET). Our approach detects events by exploring their textual and temporal components. ET does not require any target entity or domain knowledge to be specified; it automatically detects events from a set of tweets. The key components of ET are (1) an extraction scheme for event-representative keywords, (2) an efficient storage mechanism for their appearance patterns, and (3) a hierarchical clustering technique based on the common co-occurring features of keywords. The events are determined through the hierarchical clustering process. We evaluate our system on two datasets: one provided by the VAST Challenge 2011, and the other consisting of tweets published by US-based users in January 2013. Our results show that we are able to detect events of relevance efficiently.
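The three components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the extraction scheme, the storage layout, or the similarity measure, so everything below is an assumption made for illustration. Keywords are taken to be words that appear in at least `min_count` tweets, appearance patterns are per-time-slot counts, and clustering is average-linkage agglomeration over Jaccard similarity of co-occurring word sets. All function names, the stopword list, and the sample tweets are hypothetical.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Assumed, deliberately tiny stopword list (illustrative only).
STOPWORDS = {"the", "at", "a", "an", "and", "out", "of", "in", "on", "to", "is"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords and short words."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if len(w) > 2 and w not in STOPWORDS}


def index_tweets(tweets, min_count=2):
    """(1) Pick keywords and (2) store their appearance patterns.

    tweets is a list of (time_slot, text) pairs.
    """
    postings = defaultdict(set)      # word -> ids of tweets containing it
    patterns = defaultdict(Counter)  # word -> time slot -> tweet count
    for tweet_id, (slot, text) in enumerate(tweets):
        for word in tokenize(text):
            postings[word].add(tweet_id)
            patterns[word][slot] += 1
    keywords = sorted(w for w, ids in postings.items() if len(ids) >= min_count)
    return keywords, postings, patterns


def cooccurrence_features(tweets, keywords, postings):
    """Feature set of a keyword: every word seen in the tweets that contain it."""
    tokens = [tokenize(text) for _, text in tweets]
    return {k: set().union(*(tokens[i] for i in postings[k])) for k in keywords}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_keywords(keywords, features, threshold=0.5):
    """(3) Average-linkage agglomerative clustering, stopped at a threshold."""
    clusters = [[k] for k in keywords]

    def linkage(c1, c2):
        sims = [jaccard(features[x], features[y]) for x in c1 for y in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) < threshold:
            break  # remaining clusters are too dissimilar to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical sample: two time slots, two unrelated topics.
tweets = [
    (0, "Earthquake shakes downtown buildings"),
    (0, "Huge earthquake downtown, buildings damaged"),
    (0, "earthquake felt downtown"),
    (1, "Concert tonight at the stadium"),
    (1, "Stadium concert tickets sold out"),
    (1, "Great concert at stadium"),
]
keywords, postings, patterns = index_tweets(tweets)
features = cooccurrence_features(tweets, keywords, postings)
events = cluster_keywords(keywords, features)
print(events)
# [['buildings', 'downtown', 'earthquake'], ['concert', 'stadium']]
```

Each resulting cluster is read as one candidate event. The per-slot counts in `patterns` are stored here but not used for clustering; a fuller treatment of the temporal component described in the abstract would need them.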
- Published
- 2013