
Mining Linked Open Data: a Case Study with Genes Responsible for Intellectual Disability

Authors :
Malika Smaïl-Tabbone
Adrien Coulet
Marie-Dominique Devignes
Céline Bonnet
Gabin Personeni
Philippe Jonveaux
Simon Daget
Knowledge representation, reasoning (ORPAILLEUR)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)
Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Centre National de la Recherche Scientifique (CNRS)
Université de Lorraine (UL)
Service de Génétique [CHRU Nancy]
Centre Hospitalier Régional Universitaire de Nancy (CHRU Nancy)
Helena Galhardas, Erhard Rahm
Source :
Data Integration in the Life Sciences, 10th International Conference, DILS 2014, Jul 2014, Lisbon, Portugal. Lecture Notes in Computer Science, pp. 16-31, ISBN: 9783319085890, ⟨10.1007/978-3-319-08590-6_2⟩; ECCB'14 (European Conference on Computational Biology 2014), Sep 2014, Strasbourg, France.
Publication Year :
2014
Publisher :
HAL CCSD, 2014.

Abstract

Linked Open Data (LOD) constitute a unique dataset that is in a standard format, is partially integrated, and facilitates connections with domain knowledge represented in semantic web ontologies. The increasing amount of biomedical data published as LOD consequently offers novel opportunities for knowledge discovery in biomedicine. However, most data mining methods are adapted neither to the LOD format nor to taking domain knowledge into account. In this paper we propose an approach for selecting, integrating, and mining LOD with the goal of discovering genes responsible for a disease. The selection step relies on a set of choices made by a domain expert to isolate relevant pieces of LOD. Because these pieces are not necessarily linked to one another, an integration step is required to connect them. The resulting graph is then mined using Inductive Logic Programming (ILP), which offers two main advantages. First, the input format required by ILP is close to the LOD format. Second, domain knowledge can be added to this input and taken into account by ILP. We have implemented this approach and applied it to the characterization of genes responsible for intellectual disability. On the basis of this real-world use case, we present an evaluation of our mining approach and discuss its advantages and drawbacks for mining biomedical LOD.

Details

Language :
English
ISBN :
978-3-319-08589-0
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....879c4e79cb16222c0df9c957af5da054
Full Text :
https://doi.org/10.1007/978-3-319-08590-6_2