1. OFrLex: A Computational Morphological and Syntactic Lexicon for Old French
- Author
- Guibon, Gaël; Sagot, Benoît
Affiliations: Laboratoire de Linguistique Formelle (LLF, UMR7110), Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité; also listed as Université de Paris, UP); Automatic Language Modelling and ANAlysis & Computational Humanities (ALMAnaCH), Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)
Funding: This work was partly funded by the French national ANR grant PROFITEROLE (ANR-16-CE38-0010, "Modélisation de l'évolution de la langue à partir de textes d'ancien français instrumentés", 2016), headed by Sophie Prévost, as well as by the second author's chair in the PRAIRIE institute (PaRis Artificial Intelligence Research InstitutE, 2019), funded by the French national agency ANR as part of the "Investissements d'avenir" programme under the reference ANR-19-P3IA-0001.
- Subjects
Lexicon Enrichment, Morphological lexicon, Old French, Syntactic lexicon, [SHS.LANGUE]Humanities and Social Sciences/Linguistics, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
- Abstract
Due to the COVID-19 pandemic, the 12th edition was cancelled. The LREC 2020 Proceedings are available at http://www.lrec-conf.org/proceedings/lrec2020/index.html. Version 2 of the paper is an updated version of the originally published one (minor corrections).
In this paper we describe our work on the development and enrichment of OFrLex, a freely available, large-coverage morphological and syntactic lexicon for Old French. We rely on several heterogeneous language resources to extract structured and exploitable information. The extraction follows a semi-automatic procedure with substantial manual steps, which address the difficulties encountered when aligning lexical entries from distinct language resources. OFrLex aims to improve natural language processing tasks on Old French such as part-of-speech tagging and dependency parsing. We provide quantitative information on OFrLex and discuss its reliability. We also describe and evaluate a semi-automatic, word-embedding-based lexical enrichment process aimed at increasing the accuracy of the resource. The results of this extension technique will be manually validated in the near future, a step that will take advantage of OFrLex's viewing, searching and editing interface, which is already accessible online.
- Published
- 2020