Extracting Definienda in Mathematical Scholarly Articles with Transformers
- Publication Year :
- 2023
Abstract
- We consider automatically identifying the defined term within a mathematical definition from the text of an academic article. Inspired by the development of transformer-based natural language processing applications, we pose the problem as (a) a token-level classification task using fine-tuned pre-trained transformers; and (b) a question-answering task using a generalist large language model (GPT). We also propose a rule-based approach to build a labeled dataset from the LaTeX source of papers. Experimental results show that it is possible to reach high levels of precision and recall using either recent (and expensive) GPT-4 or simpler pre-trained models fine-tuned on our task.
- Comment: In the Proceedings of the 2nd Workshop on Information Extraction from Scientific Publications (WIESP 2023)
- Subjects :
- Computer Science - Artificial Intelligence
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2311.12448
- Document Type :
- Working Paper