Extracting Definienda in Mathematical Scholarly Articles with Transformers

Authors :
Jiang, Shufan
Senellart, Pierre
Publication Year :
2023

Abstract

We consider automatically identifying the defined term within a mathematical definition from the text of an academic article. Inspired by the development of transformer-based natural language processing applications, we pose the problem as (a) a token-level classification task using fine-tuned pre-trained transformers; and (b) a question-answering task using a generalist large language model (GPT). We also propose a rule-based approach to build a labeled dataset from the LaTeX source of papers. Experimental results show that it is possible to reach high levels of precision and recall using either the recent (and expensive) GPT-4 or simpler pre-trained models fine-tuned on our task.

Comment: In the Proceedings of the 2nd Workshop on Information Extraction from Scientific Publications (WIESP 2023)
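The abstract does not spell out the labeling rules, so the following is only a minimal sketch of how a rule-based labeler over LaTeX source could produce token-level training data. It assumes, purely for illustration, that the definiendum is the text wrapped in `\emph{...}` or `\textit{...}` inside a `definition` environment. The function names `label_definition` and `label_document` and the tag names `B-DEF`, `I-DEF`, and `O` are hypothetical and are not taken from the paper.

```python
import re

# Assumption (not from the paper): the defined term is whatever is emphasized
# with \emph{...} or \textit{...} inside a definition environment.
DEF_ENV = re.compile(r"\\begin\{definition\}(.*?)\\end\{definition\}", re.DOTALL)
EMPH = re.compile(r"\\(?:emph|textit)\{([^{}]*)\}")


def label_definition(body):
    """Return (tokens, BIO tags) for the body of one definition."""
    tokens, tags = [], []
    pos = 0
    for match in EMPH.finditer(body):
        # Words before the emphasized span lie outside any definiendum.
        for word in body[pos:match.start()].split():
            tokens.append(word)
            tags.append("O")
        # The first word of the span is tagged B-DEF, the rest I-DEF.
        for i, word in enumerate(match.group(1).split()):
            tokens.append(word)
            tags.append("B-DEF" if i == 0 else "I-DEF")
        pos = match.end()
    # Remaining words after the last emphasized span.
    for word in body[pos:].split():
        tokens.append(word)
        tags.append("O")
    return tokens, tags


def label_document(latex_source):
    """Label every definition environment found in a LaTeX source string."""
    return [label_definition(m.group(1)) for m in DEF_ENV.finditer(latex_source)]


sample = r"""
\begin{definition}
A graph is \emph{connected} if any two vertices are joined by a path.
\end{definition}
"""

for tokens, tags in label_document(sample):
    for token, tag in zip(tokens, tags):
        print(f"{token}\t{tag}")
```

Token and tag pairs of this kind are the usual input for fine-tuning a pre-trained transformer on token classification, which is formulation (a) in the abstract. Whitespace splitting is a simplification here: a real pipeline would also have to handle nested macros, inline math, and subword tokenization.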

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2311.12448
Document Type :
Working Paper