EDTBERT: Event Detection and Tracking in Twitter using Graph Clustering and Pre-trained Language Model.

Authors :
Pradhan, Abhaya Kumar
Mohanty, Hrushikesha
Lal, Rajendra Prasad
Source :
Procedia Computer Science; 2024, Vol. 233, p481-491, 11p
Publication Year :
2024

Abstract

The identification of events from social media platforms such as Twitter (now known as X) is an active research problem. It has applications in diverse domains such as journalism, marketing, public safety, crisis management and disaster response. The process includes the identification, monitoring, and analysis of events or incidents while they are being discussed or reported on Twitter. When it comes to identifying events from tweets (i.e. feeds from Twitter), many of the currently available event detection methods rely mainly on keyword burstiness features or structural changes in the network. However, due to the intricate characteristics of tweets and the ever-changing nature of events, these methods frequently fail to recognise noteworthy occurrences before they become trending. Moreover, they have difficulty capturing the evolving characteristics of events when contextual information is limited or insufficient. In this paper, we propose a window-based tweet-processing method called EDTBERT for detecting events and tracking their evolution over time. Our proposed method utilizes the structural and semantic affinities that exist among words in tweets. The method starts by generating a graph of tweets, where tweets are represented as nodes and edges are the similarities between tweets. The method utilizes overlapping hashtags and named entities to capture the structural relationship between tweets. Additionally, a pre-trained sentence transformer model, specifically BERT, is employed to collect contextual knowledge and find semantically similar tweets. Next, a graph clustering technique is employed to identify optimized event clusters. Subsequently, our method generates a chain of event clusters for each event to track the evolving variation of the event over time by utilising the "Maximum-Weight Bipartite Graph Matching" (MWBGM) algorithm. The effectiveness of our approach is assessed using standard tweet datasets. Our evaluation demonstrates that our approach outperforms the baseline approaches. [ABSTRACT FROM AUTHOR]
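
The tracking step described in the abstract, linking event clusters across consecutive time windows by maximum-weight bipartite matching, can be illustrated with a small sketch. This is not the authors' implementation. It assumes each cluster is summarised as a set of hashtags and named entities, and it uses Jaccard overlap as the edge weight; the abstract does not state the actual weighting, and the BERT-based semantic similarity is left out. The names `jaccard`, `match_clusters` and `min_weight` are hypothetical. The search is brute force, which is adequate only for a handful of clusters per window.

```python
from itertools import permutations


def jaccard(a, b):
    """Jaccard overlap between two feature sets (assumed edge weight)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_clusters(prev, curr, min_weight=0.0):
    """Link clusters of one window to clusters of the next window.

    prev, curr: lists of feature sets, one set per event cluster.
    Returns sorted (prev_index, curr_index, weight) triples forming a
    maximum-weight bipartite matching; links whose weight is not above
    min_weight are dropped, so unrelated clusters stay unlinked.
    """
    # Permute over the larger side so every cluster on the smaller
    # side is assigned to a distinct partner.
    swapped = len(prev) > len(curr)
    small, large = (curr, prev) if swapped else (prev, curr)

    best_total, best_pairs = -1.0, []
    for perm in permutations(range(len(large)), len(small)):
        pairs = [(i, j, jaccard(small[i], large[j])) for i, j in enumerate(perm)]
        total = sum(w for _, _, w in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs

    links = []
    for i, j, w in best_pairs:
        if w > min_weight:
            # Restore (prev_index, curr_index) order if sides were swapped.
            links.append((j, i, w) if swapped else (i, j, w))
    return sorted(links)


# Hypothetical clusters from two consecutive windows.
window_t = [{"#quake", "Tokyo"}, {"#election", "Delhi"}]
window_t1 = [{"#election", "Delhi", "#vote"},
             {"#quake", "Tokyo", "#tsunami"},
             {"#oscars"}]

for p, c, w in match_clusters(window_t, window_t1):
    print(f"cluster {p} in window t continues as cluster {c} in window t+1 "
          f"(weight {w:.2f})")
```

Repeating the matching over each pair of consecutive windows and following the links yields one chain of clusters per event. For more than a few clusters per window, a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment` with `maximize=True` would be a more practical choice than enumerating permutations.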

Details

Language :
English
ISSN :
18770509
Volume :
233
Database :
Supplemental Index
Journal :
Procedia Computer Science
Publication Type :
Academic Journal
Accession number :
176500396
Full Text :
https://doi.org/10.1016/j.procs.2024.03.238