Generating knowledge graphs by employing Natural Language Processing and Machine Learning techniques within the scholarly domain

Authors :: Enrico Motta
Danilo Dessì
Diego Reforgiato Recupero
Francesco Osborne
Davide Buscaldi
Laboratoire d'Informatique de Paris-Nord (LIPN)
Université Sorbonne Paris Cité (USPC)-Institut Galilée-Université Paris 13 (UP13)-Centre National de la Recherche Scientifique (CNRS)
Dessì, D
Osborne, F
Reforgiato Recupero, D
Buscaldi, D
Motta, E
Source :: Future Generation Computer Systems, Elsevier, 2021, 116, pp. 253-264. ⟨10.1016/j.future.2020.10.026⟩
Publication Year :: 2021
Publisher :: Elsevier BV, 2021.
Abstract: The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is that its analysis has become difficult due to the high volume of published papers, for which manual effort for annotation and management is required. Novel technological infrastructures are needed to help researchers, research policy makers, and companies to time-efficiently browse, analyse, and forecast scientific research. Knowledge graphs, i.e., large networks of entities and relationships, have proved to be an effective solution in this space. Scientific knowledge graphs focus on the scholarly domain and typically contain metadata describing research publications such as authors, venues, organizations, research topics, and citations. However, the current generation of knowledge graphs lacks an explicit representation of the knowledge presented in the research papers. As such, in this paper, we present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications and integrates them in a large-scale knowledge graph. Within this research work, we i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools, ii) describe an approach for integrating entities and relationships generated by these tools, iii) show the advantage of such a hybrid system over alternative approaches, and iv) as a chosen use case, generate a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain. As our approach is general and can be applied to any domain, we expect that it can facilitate the management, analysis, dissemination, and processing of scientific knowledge.
Accepted for publication in Future Generation Computer Systems journal - Special Issue on Machine Learning and Knowledge Graphs

Subjects :: FOS: Computer and information sciences
Computer Science - Machine Learning
Sociology of scientific knowledge
Knowledge representation and reasoning
Computer Science - Artificial Intelligence
Computer Networks and Communications
Computer science
02 engineering and technology
Scientific literature
Machine learning
computer.software_genre
Machine Learning (cs.LG)
Text mining
Knowledge extraction
0202 electrical engineering, electronic engineering, information engineering
Semantic Web
ComputingMilieux_MISCELLANEOUS
Computer Science - Computation and Language
business.industry
INF/01 - INFORMATICA
020206 networking & telecommunications
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
Metadata
Artificial Intelligence (cs.AI)
Knowledge graph
Hardware and Architecture
Hybrid system
Scientific method
ING-INF/01 - ELETTRONICA
020201 artificial intelligence & image processing
Artificial intelligence
Knowledge Graphs, Knowledge Graph Generation, Semantic Web, Information Extraction, Natural Language Processing, Artificial Intelligence
business
Computation and Language (cs.CL)
computer
Software
Natural language processing

Details

ISSN :: 0167739X
Volume :: 116
Database :: OpenAIRE
Journal :: Future Generation Computer Systems
Accession number :: edsair.doi.dedup.....fce2641416a40ae07a61d088e2e81776
