Closed-domain event extraction for hard news event monitoring: a systematic study
- Source :
- PeerJ Computer Science, Vol 10, p e2355 (2024)
- Publication Year :
- 2024
- Publisher :
- PeerJ Inc., 2024.
Abstract
- News event monitoring systems allow real-time monitoring of a large number of events reported in the news, including the urgent and critical events comprising the so-called hard news. These systems heavily rely on natural language processing (NLP) to perform automatic event extraction at scale. While state-of-the-art event extraction models are readily available, integrating them into a news event monitoring system is not as straightforward as it seems due to practical issues related to model selection, robustness, and scale. To address this gap, we present a study on the practical use of event extraction models for news event monitoring. Our study focuses on the key task of closed-domain main event extraction (CDMEE), which aims to determine the type of the story’s main event and extract its arguments from the text. We evaluate a range of state-of-the-art NLP models for this task, including those based on pre-trained language models. Aiming at a more realistic evaluation than done in the literature, we introduce a new dataset manually labeled with event types and their arguments. Additionally, we assess the scalability of CDMEE models and analyze the trade-off between accuracy and inference speed. Our results give insights into the performance of state-of-the-art NLP models on the CDMEE task and provide recommendations for developing effective, robust, and scalable news event monitoring systems.
Details
- Language :
- English
- ISSN :
- 2376-5992
- Volume :
- 10
- Database :
- Directory of Open Access Journals
- Journal :
- PeerJ Computer Science
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.fed871bbaa54f0da3eb0d918469a58f
- Document Type :
- article
- Full Text :
- https://doi.org/10.7717/peerj-cs.2355