1. Named Entity Recognition and Linking for Entity Extraction from Italian Civil Judgements
- Author
- Basili, R., Lembo, D., Limongelli, C., Orlandini, A., Pozzi, R., Rubini, R., Bernasconi, C., and Palmonari, M.
- Abstract
The extraction of named entities from court judgments is useful in several downstream applications, such as document anonymization and semantic search engines. In this paper, we discuss the application of named entity recognition and linking (NEEL) to extract entities from Italian civil court judgments. To develop and evaluate our work, we use a corpus of 146 manually annotated court judgments. We use a pipeline that combines a transformer-based Named Entity Recognition (NER) component, a transformer-based Named Entity Linking (NEL) component, and a NIL prediction component. While the NEL and NIL prediction components are not fine-tuned on domain-specific data, the NER component is fine-tuned on the annotated corpus. In addition, we compare different masked language modeling (MLM) adaptation strategies to optimize the results and investigate their impact. Results obtained on a 30-document test set reveal satisfactory performance, especially on the NER task, and highlight the challenges of improving NEEL on similar documents. Our code is available on GitHub (https://github.com/rpo19/pozzi_aixia_2023). We are not allowed to publish sensitive data or the NER models trained on sensitive data.
- Published
- 2023