Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations

Authors :
Greiner-Petter, Andre
Schubotz, Moritz
Mueller, Fabian
Breitinger, Corinna
Cohl, Howard S.
Aizawa, Akiko
Gipp, Bela
Publication Year :
2020

Abstract

Mathematical notation, i.e., the writing system used to communicate concepts in mathematics, encodes valuable information for a variety of information search and retrieval systems. Yet, mathematical notations remain mostly unutilized by today's systems. In this paper, we present the first in-depth study on the distributions of mathematical notation in two large scientific corpora: the open-access arXiv (2.5B mathematical objects) and zbMATH, the mathematical reviewing service for pure and applied mathematics (61M mathematical objects). Our study lays a foundation for future research projects on mathematical information retrieval for large scientific corpora. Further, we demonstrate the relevance of our results to a variety of use cases, for example, assisting semantic extraction systems, improving scientific search engines, and facilitating specialized math recommendation systems. The contributions of our presented research are as follows: (1) we present the first distributional analysis of mathematical formulae on arXiv and zbMATH; (2) we retrieve relevant mathematical objects for given textual search queries (e.g., linking $P_{n}^{(\alpha, \beta)}\!\left(x\right)$ with "Jacobi polynomial"); (3) we extend zbMATH's search engine by providing relevant mathematical formulae; and (4) we exemplify the applicability of the results by presenting auto-completion for math inputs as the first contribution to math recommendation systems. To expedite future research projects, we have made available our source code and data.

Comment: Proceedings of The Web Conference 2020 (WWW'20), April 20-24, 2020, Taipei, Taiwan

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2002.02712
Document Type :
Working Paper
Full Text :
https://doi.org/10.1145/3366423.3380218