1. Predicting entity mentions in scientific literature
- Author
Zheng, Yalung; Ezeiza, Jon; Farzanehpour, Mehdi; Urbani, Jacopo; Gray, Alasdair J.G.; Janowicz, Krzysztof; Hammar, Karl; Hitzler, Pascal; Fernández, Miriam; Lopez, Vanessa; Haller, Armin; Zaveri, Amrapali; Computer Systems; Network Institute; High Performance Distributed Computing
- Subjects
Information retrieval; Artificial neural network; Computer science; 010401 analytical chemistry; 0202 electrical engineering, electronic engineering, information engineering; 020207 software engineering; 02 engineering and technology; Scientific literature; Paywall; 01 natural sciences; Value (mathematics); 0104 chemical sciences; Task (project management)
- Abstract
Predicting which entities are likely to be mentioned in scientific articles is a task with significant academic and commercial value. For instance, it can lead to monetary savings if the articles are behind paywalls, or be used to recommend articles that are not yet available. Despite extensive prior work on entity prediction in Web documents, the peculiarities of scientific literature make it a unique scenario for this task. In this paper, we present an approach that uses a neural network to predict whether the (unseen) body of an article contains entities defined in domain-specific knowledge bases (KBs). The network uses features from the abstracts and the KB, and it is trained using open-access articles and authors’ prior works. Our experiments on biomedical literature show that our method is able to predict subsets of entities with high accuracy. As far as we know, our method is the first of its kind and is currently used in several commercial settings.
- Published
- 2019