A task set proposal for automatic protest information collection across multiple countries

Authors :
Osman Mutlu
Erdem Yörük
Deniz Yuret
Burak Gürel
Çağrı Yoltar
Fırat Duruşan
Ali Hürriyetoğlu
Hürriyetoğlu, Ali
Yörük, Erdem (ORCID 0000-0002-4882-0812 & YÖK ID 28982)
Yoltar, Çağrı
Yuret, Deniz (ORCID 0000-0002-7039-0046 & YÖK ID 179996)
Gürel, Burak (ORCID 0000-0002-1666-8748 & YÖK ID 219277)
Duruşan, Fırat
Mutlu, Osman
Graduate School of Social Sciences and Humanities
Graduate School of Sciences and Engineering
Department of Sociology
Department of Computer Science and Engineering
Source :
Lecture Notes in Computer Science; Advances in Information Retrieval: 41st European Conference on IR Research, ECIR 2019, Cologne, Germany, April 14–18, 2019, Proceedings, Part II; ISBN: 9783030157180
Publication Year :
2019
Publisher :
Springer, 2019.

Abstract

We propose a coherent set of tasks for protest information collection in the context of generalizable natural language processing. The tasks are news article classification, event sentence detection, and event extraction. Tools for collecting event information from data produced in multiple countries enable comparative studies in sociology and politics. We have annotated English news articles from a source country and a target country so that we can measure how tools developed using data from one country perform on data from a different country. Our preliminary experiments show that the performance of tools developed using English texts from India drops to an unusable level when they are applied to English texts from China. We think our setting addresses the challenge of building generalizable NLP tools that perform well independent of the source of the text, and that it will accelerate progress toward generalizable NLP systems.

European Research Council (ERC) Starting Grant; European Union (EU); Horizon 2020

Details

Language :
English
ISBN :
978-3-030-15718-0
978-3-030-15719-7
ISSN :
0302-9743 and 1611-3349
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....8b24675af2bba0bab4350d93894cc72a