1. PGxO and PGxLOD: a reconciliation of pharmacogenomic knowledge of various provenances, enabling further comparison
- Author
Pierre Monnin, Clement Jonquet, Amedeo Napoli, Andon Tchechmedjiev, Adrien Coulet, Joël Legrand, Patrice Ringot, and Graziella Husson
Affiliations: Knowledge representation, reasoning (ORPAILLEUR), Inria Nancy - Grand Est, Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria), Université de Lorraine (UL), Centre National de la Recherche Scientifique (CNRS); WEB-CUBE, Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS, Université de Montpellier (UM); Stanford Center for BioMedical Informatics Research (BMIR), Stanford University; Equipe associée Snowball; Service Informatique de Soutien à la Recherche (SISR), Inria, UL, CNRS; WEB Architecture x Semantic WEB x WEB of Data (WEB3), UM, CNRS
Funding: ANR-15-CE23-0028, PractiKPharma, Confrontation entre connaissances de l'état de l'art et connaissances extraites de dossiers patients en pharmacogénomique (2015); ANR-15-IDEX-0004, LUE, Isite LUE (2015); European Project: 701771, H2020, H2020-MSCA-IF-2015, SIFRm (2016)
- Subjects
PharmGKB, Databases, Factual, Computer science, Knowledge Bases, Tissue Banks, Ontology (information science), lcsh:Computer applications to medicine. Medical informatics, 030226 pharmacology & pharmacy, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], 03 medical and health sciences, 0302 clinical medicine, Similarity (psychology), Data Mining, Electronic Health Records, Humans, Linked open data, Set (psychology), lcsh:QH301-705.5, Semantic Web, 030304 developmental biology, Knowledge engineering, 0303 health sciences, Information retrieval, [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB], business.industry, Ontology, Methodology, Linked data, Knowledge comparison, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, lcsh:Biology (General), Knowledge base, Pharmacogenetics, Pharmacogenomics, [SDV.SP.PHARMA]Life Sciences [q-bio]/Pharmaceutical sciences/Pharmacology, lcsh:R858-859.7, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], business
- Abstract
Background: Pharmacogenomics (PGx) studies how genomic variations impact variations in drug response phenotypes. Knowledge in pharmacogenomics is typically composed of units that have the form of ternary relationships gene variant – drug – adverse event. Such a relationship states that an adverse event may occur for patients having the specified gene variant and being exposed to the specified drug. State-of-the-art knowledge in PGx is mainly available in reference databases such as PharmGKB and reported in the scientific biomedical literature. However, PGx knowledge can also be discovered from clinical data, such as Electronic Health Records (EHRs); in this case, it may either correspond to new knowledge or confirm state-of-the-art knowledge that lacks a "clinical counterpart" or validation. For this reason, there is a need for automatic comparison of knowledge units from distinct sources.
Results: In this article, we propose an approach, based on Semantic Web technologies, to represent and compare PGx knowledge units. To this end, we developed PGxO, a simple ontology that represents PGx knowledge units and their components. Combined with PROV-O, an ontology developed by the W3C to represent provenance information, PGxO enables encoding provenance information and associating it with PGx relationships. Additionally, we introduce a set of rules to reconcile PGx knowledge, i.e. to identify when two relationships, potentially expressed using different vocabularies and levels of granularity, refer to the same knowledge unit or to different ones. We evaluated our ontology and rules by populating PGxO with knowledge units extracted from PharmGKB (2,701), the literature (65,720), and discoveries reported in EHR analysis studies (only 10, manually extracted), and by testing their similarity. We named the resulting knowledge base PGxLOD (PGx Linked Open Data); it represents and reconciles knowledge units of those various origins.
Conclusions: The proposed ontology and reconciliation rules constitute a first step toward a more complete framework for knowledge comparison in PGx. In this direction, the experimental instantiation of PGxO, named PGxLOD, illustrates both the feasibility and the difficulties of reconciling various existing knowledge sources.
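The idea of reconciling knowledge units expressed at different levels of granularity can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not the PGxO ontology or the rules from the article: the `KnowledgeUnit` class, the `BROADER` table, the `compare` function, and every identifier in them are hypothetical placeholders. Each unit holds its three components plus a provenance label, and two units are compared component by component using a hand-written "narrower than" table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KnowledgeUnit:
    """Toy PGx knowledge unit: gene variant - drug - adverse event, plus its source."""
    variants: frozenset
    drugs: frozenset
    phenotypes: frozenset
    source: str  # provenance label, e.g. "PharmGKB", "literature", "EHR study"

# Hypothetical child -> parent links standing in for vocabulary hierarchies.
BROADER = {
    "GENE_A*2": "GENE_A",
    "drug_x": "drug_class_X",
}

def ancestors(term):
    """Return the term together with all of its broader terms."""
    seen = {term}
    while term in BROADER:
        term = BROADER[term]
        seen.add(term)
    return seen

def covers(general, specific):
    """True if every term in `specific` equals or is narrower than a term in `general`."""
    return all(ancestors(s) & general for s in specific)

def compare(a, b):
    """Classify two units, ignoring provenance: same, one more specific, or different."""
    parts_a = (a.variants, a.drugs, a.phenotypes)
    parts_b = (b.variants, b.drugs, b.phenotypes)
    if parts_a == parts_b:
        return "same"
    if all(covers(g, s) for g, s in zip(parts_b, parts_a)):
        return "a_more_specific"
    if all(covers(g, s) for g, s in zip(parts_a, parts_b)):
        return "b_more_specific"
    return "different"

# Placeholder units from two hypothetical sources.
ehr_unit = KnowledgeUnit(
    frozenset({"GENE_A*2"}), frozenset({"drug_x"}), frozenset({"event_y"}), "EHR study"
)
lit_unit = KnowledgeUnit(
    frozenset({"GENE_A"}), frozenset({"drug_class_X"}), frozenset({"event_y"}), "literature"
)
result = compare(ehr_unit, lit_unit)
```

Keeping the provenance label outside the comparison mirrors the motivation stated in the abstract: the same relationship may be reported by several sources, and the interesting question is whether their components match. The one-directional `covers` check is a deliberate simplification; a fuller treatment would rely on the vocabularies and mappings actually used by each source.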
- Published
- 2019