Back to Search
Start Over
Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations
- Publication Year :
- 2020
-
Abstract
- Mathematical notation, i.e., the writing system used to communicate concepts in mathematics, encodes valuable information for a variety of information search and retrieval systems. Yet, mathematical notations remain mostly unutilized by today's systems. In this paper, we present the first in-depth study on the distributions of mathematical notation in two large scientific corpora: the open access arXiv (2.5B mathematical objects) and the mathematical reviewing service for pure and applied mathematics zbMATH (61M mathematical objects). Our study lays a foundation for future research projects on mathematical information retrieval for large scientific corpora. Further, we demonstrate the relevance of our results to a variety of use-cases. For example, to assist semantic extraction systems, to improve scientific search engines, and to facilitate specialized math recommendation systems. The contributions of our presented research are as follows: (1) we present the first distributional analysis of mathematical formulae on arXiv and zbMATH; (2) we retrieve relevant mathematical objects for given textual search queries (e.g., linking $P_{n}^{(\alpha, \beta)}\!\left(x\right)$ with `Jacobi polynomial'); (3) we extend zbMATH's search engine by providing relevant mathematical formulae; and (4) we exemplify the applicability of the results by presenting auto-completion for math inputs as the first contribution to math recommendation systems. To expedite future research projects, we have made available our source code and data.<br />Comment: Proceedings of The Web Conference 2020 (WWW'20), April 20--24, 2020, Taipei, Taiwan
- Subjects :
- Computer Science - Digital Libraries
Computer Science - Information Retrieval
Subjects
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2002.02712
- Document Type :
- Working Paper
- Full Text :
- https://doi.org/10.1145/3366423.3380218