EDTBERT: Event Detection and Tracking in Twitter using Graph Clustering and Pre-trained Language Model.
- Source :
- Procedia Computer Science; 2024, Vol. 233, p481-491, 11p
- Publication Year :
- 2024
Abstract
- The identification of events from social media platforms such as Twitter (now known as X) is an active research problem. It has applications in diverse domains such as journalism, marketing, public safety, crisis management and disaster response. The process includes the identification, monitoring, and analysis of events or incidents while they are being discussed or reported on Twitter. Many existing methods for detecting events from tweets (i.e., feeds from Twitter) rely mainly on keyword burstiness features or structural changes in the network. However, due to the intricate characteristics of tweets and the ever-changing nature of events, they frequently fail to recognise noteworthy occurrences before they become trending. Moreover, these methods struggle to capture the evolving characteristics of events when contextual information is limited. In this paper, we propose a window-based tweet-processing method called EDTBERT for detecting events and tracking their evolution over time. Our proposed method exploits the structural and semantic affinities that exist among words in tweets. The method starts by generating a graph of tweets, where tweets are represented as nodes and edges represent the similarities between tweets. The method uses overlapping hashtags and named entities to capture the structural relationship between tweets. Additionally, a pre-trained sentence transformer model, specifically BERT, is employed to collect contextual knowledge and find semantically similar tweets. Next, a graph clustering technique is employed to identify optimized event clusters. Subsequently, our method generates a chain of event clusters for each event to track how the event evolves over time, using the "Maximum-Weight Bipartite Graph Matching" (MWBGM) algorithm. The effectiveness of our approach is assessed using standard tweet datasets. Our evaluation demonstrates that our approach outperforms the baseline approaches. [ABSTRACT FROM AUTHOR]
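The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the field names `tags` and `vec`, the mixing weight `alpha`, the edge threshold, the use of connected components as the clustering step, and the brute-force matcher are all hypothetical choices, and the toy vectors stand in for BERT sentence embeddings.

```python
import math
from itertools import permutations

def jaccard(a, b):
    """Structural overlap of two hashtag/named-entity sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a and b else 0.0

def cosine(u, v):
    """Semantic similarity of two embedding vectors (toy stand-in for BERT)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_tweet_graph(tweets, alpha=0.5, threshold=0.5):
    """Return {(i, j): weight} for tweet pairs whose blended similarity
    reaches the threshold. alpha and threshold are assumed values."""
    edges = {}
    for i in range(len(tweets)):
        for j in range(i + 1, len(tweets)):
            w = (alpha * jaccard(tweets[i]["tags"], tweets[j]["tags"])
                 + (1 - alpha) * cosine(tweets[i]["vec"], tweets[j]["vec"]))
            if w >= threshold:
                edges[(i, j)] = w
    return edges

def cluster(n, edges):
    """Connected components via union-find; a simple stand-in for the
    (unspecified) graph clustering technique."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(g) for g in groups.values())

def max_weight_matching(weights):
    """Brute-force maximum-weight bipartite matching.
    weights[i][j] is the similarity between cluster i of the previous
    window and cluster j of the current window. Small inputs only."""
    n_prev = len(weights)
    n_curr = len(weights[0]) if weights else 0
    if n_prev == 0 or n_curr == 0:
        return []
    best_total, best_pairs = -1.0, []
    if n_prev <= n_curr:
        for perm in permutations(range(n_curr), n_prev):
            pairs = [(i, perm[i]) for i in range(n_prev)]
            total = sum(weights[i][j] for i, j in pairs)
            if total > best_total:
                best_total, best_pairs = total, pairs
    else:
        for perm in permutations(range(n_prev), n_curr):
            pairs = [(perm[j], j) for j in range(n_curr)]
            total = sum(weights[i][j] for i, j in pairs)
            if total > best_total:
                best_total, best_pairs = total, pairs
    # Zero-weight pairs carry no evidence of continuity, so drop them.
    return [(i, j) for i, j in best_pairs if weights[i][j] > 0]

tweets = [
    {"tags": {"#quake", "Tokyo"}, "vec": [1.0, 0.0]},
    {"tags": {"#quake", "Japan"}, "vec": [0.9, 0.1]},
    {"tags": {"#election", "Paris"}, "vec": [0.0, 1.0]},
    {"tags": {"#election", "France"}, "vec": [0.1, 0.9]},
]
clusters = cluster(len(tweets), build_tweet_graph(tweets))
links = max_weight_matching([[0.9, 0.8], [0.8, 0.1]])
```

In this toy window the two earthquake tweets and the two election tweets end up in separate clusters. The matching example uses a weight matrix where a greedy choice (pairing the 0.9 entry first) would be suboptimal, which is why an exact matcher is used to link clusters across consecutive windows; for realistic sizes a polynomial-time solver would replace the brute-force search.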
Details
- Language :
- English
- ISSN :
- 18770509
- Volume :
- 233
- Database :
- Supplemental Index
- Journal :
- Procedia Computer Science
- Publication Type :
- Academic Journal
- Accession number :
- 176500396
- Full Text :
- https://doi.org/10.1016/j.procs.2024.03.238