Don't Annotate, but Validate: a Data-to-Text Method for Capturing Event Data
- Source :
- [Proceedings of the] Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 3034-3042
- Publication Year :
- 2018
- Publisher :
- LREC, 2018.
Abstract
- In this paper, we present a new method to obtain large volumes of high-quality text corpora with event data for studying identity and reference relations. We report on the current methods to create event reference data by annotating texts and deriving the event data a posteriori. Our method starts from event registries in which event data is defined a priori. From this data, we extract so-called Microworlds of referential data with the Reference Texts that report on these events. This makes it possible to easily establish referential relations with high precision and at a large scale. In a pilot, we successfully obtained data from these resources with extreme ambiguity and variation, while maintaining the identity and reference relations and without having to annotate large quantities of texts word-by-word. The data from this pilot was annotated using an annotation tool created specifically in order to validate our method and to enrich the reference texts with event coreference annotations. This annotation process resulted in the Gun Violence Corpus, whose development process and outcome are described in this paper.
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- [Proceedings of the] Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 3034-3042
- Accession number :
- edsair.narcis........be5885c774ec677956b7a84f649cdc0b