Closed-domain event extraction for hard news event monitoring: a systematic study.

Authors :
Dukić, David
Došilović, Filip Karlo
Pluščec, Domagoj
Šnajder, Jan
Source :
PeerJ Computer Science; Oct 2024, p1-30, 30p
Publication Year :
2024

Abstract

News event monitoring systems allow real-time monitoring of a large number of events reported in the news, including the urgent and critical events comprising the so-called hard news. These systems heavily rely on natural language processing (NLP) to perform automatic event extraction at scale. While state-of-the-art event extraction models are readily available, integrating them into a news event monitoring system is not as straightforward as it seems due to practical issues related to model selection, robustness, and scale. To address this gap, we present a study on the practical use of event extraction models for news event monitoring. Our study focuses on the key task of closed-domain main event extraction (CDMEE), which aims to determine the type of the story's main event and extract its arguments from the text. We evaluate a range of state-of-the-art NLP models for this task, including those based on pre-trained language models. Aiming at a more realistic evaluation than done in the literature, we introduce a new dataset manually labeled with event types and their arguments. Additionally, we assess the scalability of CDMEE models and analyze the trade-off between accuracy and inference speed. Our results give insights into the performance of state-of-the-art NLP models on the CDMEE task and provide recommendations for developing effective, robust, and scalable news event monitoring systems. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
2376-5992
Database :
Complementary Index
Journal :
PeerJ Computer Science
Publication Type :
Academic Journal
Accession number :
180806821
Full Text :
https://doi.org/10.7717/peerj-cs.2355