25 results on '"Emonet, Vincent"'
Search Results
2. A large collection of bioinformatics question-query pairs over federated knowledge graphs: methodology and applications
- Author
-
Bolleman, Jerven, Emonet, Vincent, Altenhoff, Adrian, Bairoch, Amos, Blatter, Marie-Claude, Bridge, Alan, Duvaud, Severine, Gasteiger, Elisabeth, Kuznetsov, Dmitry, Moretti, Sebastien, Michel, Pierre-Andre, Morgat, Anne, Pagni, Marco, Redaschi, Nicole, Zahn-Zabal, Monique, de Farias, Tarcisio Mendes, and Sima, Ana Claudia
- Subjects
Computer Science - Databases, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval
- Abstract
Background. In the last decades, several life science resources have structured data using the same framework and made these accessible using the same query language to facilitate interoperability. Knowledge graphs have seen increased adoption in bioinformatics due to their advantages for representing data in a generic graph format. For example, yummydata.org catalogs more than 60 knowledge graphs accessible through SPARQL, a technical query language. Although SPARQL allows powerful, expressive queries, even across physically distributed knowledge graphs, formulating such queries is a challenge for most users. Therefore, to guide users in retrieving the relevant data, many of these resources provide representative examples. These examples can also be an important source of information for machine learning, if a sufficiently large number of examples are provided and published in a common, machine-readable and standardized format across different resources. Findings. We introduce a large collection of human-written natural language questions and their corresponding SPARQL queries over federated bioinformatics knowledge graphs (KGs) collected for several years across different research groups at the SIB Swiss Institute of Bioinformatics. The collection comprises more than 1000 example questions and queries, including 65 federated queries. We propose a methodology to uniformly represent the examples with minimal metadata, based on existing standards. Furthermore, we introduce an extensive set of open-source applications, including query graph visualizations and smart query editors, easily reusable by KG maintainers who adopt the proposed methodology. Conclusions. We encourage the community to adopt and extend the proposed methodology, towards richer KG metadata and improved Semantic Web services.
- Published
- 2024
3. Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning
- Author
-
Caufield, J Harry, Hegde, Harshad, Emonet, Vincent, Harris, Nomi L, Joachimiak, Marcin P, Matentzoglu, Nicolas, Kim, HyeongSik, Moxon, Sierra, Reese, Justin T, Haendel, Melissa A, Robinson, Peter N, and Mungall, Christopher J
- Subjects
Data Management and Data Science, Information and Computing Sciences, Networking and Information Technology R&D (NITRD), Generic health relevance, Semantics, Knowledge Bases, Databases, Factual, Mathematical Sciences, Biological Sciences, Bioinformatics, Biological sciences, Information and computing sciences, Mathematical sciences
- Abstract
Motivation. Creating knowledge bases and ontologies is a time consuming task that relies on manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrarily complex nested knowledge schemas. Results. Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against an LLM to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for matched elements. We present examples of applying SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease relationships. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction methods, but greatly surpasses an LLM's native capability of grounding entities with unique identifiers. SPIRES has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any new training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM. Availability and implementation. SPIRES is available as part of the open source OntoGPT package: https://github.com/monarch-initiative/ontogpt.
- Published
- 2024
4. Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning
- Author
-
Caufield, J. Harry, Hegde, Harshad, Emonet, Vincent, Harris, Nomi L., Joachimiak, Marcin P., Matentzoglu, Nicolas, Kim, HyeongSik, Moxon, Sierra A. T., Reese, Justin T., Haendel, Melissa A., Robinson, Peter N., and Mungall, Christopher J.
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Creating knowledge bases and ontologies is a time consuming task that relies on manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrarily complex nested knowledge schemas. Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning (ZSL) and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against GPT-3+ to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for all matched elements. We present examples of use of SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease causation graphs. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction (RE) methods, but has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM. SPIRES is available as part of the open source OntoGPT package: https://github.com/monarch-initiative/ontogpt. Comment: Updated 2023-12-22
- Published
- 2023
5. Progress toward a universal biomedical data translator
- Author
-
Fecho, Karamarie, Thessen, Anne E, Baranzini, Sergio E, Bizon, Chris, Hadlock, Jennifer J, Huang, Sui, Roper, Ryan T, Southall, Noel, Ta, Casey, Watkins, Paul B, Williams, Mark D, Xu, Hao, Byrd, William, Dančík, Vlado, Duby, Marc P, Dumontier, Michel, Glusman, Gustavo, Harris, Nomi L, Hinderer, Eugene W, Hyde, Greg, Johs, Adam, Su, Andrew I, Qin, Guangrong, Zhu, Qian, Dougherty, Jennifer, Huang, Conrad, Magis, Andrew, Smith, Brett, Celebi, Remzi, Chen, Zhehuan, Azevedo, Ricardo De Miranda, Emonet, Vincent, Lee, Jay, Weng, Chunhua, Yilmaz, Arif, Kim, Keum Joo, Santos, Eugene, Tonstad, Lucas, Veenhuis, Luke, Yakaboski, Chase, Acevedo, Liliana, Carrell, Steven, Deutsch, Eric, Glen, Amy, Hoffman, Andrew, Koslicki, David, Kvarfordt, Lindsey, Liu, Zheng, Liu, Shaopeng, Ma, Chunyu, Mendoza, Luis, Muluka, Arun Teja, Womack, Finn, Wood, Erica, Roach, Jared, Goel, Prateek, Weber, Rosina, Williams, Andrew, Gormley, Joseph, Zisk, Tom, Hanspers, Kristina, Hoatlin, Maureen, Pico, Alexander, Riutta, Anders, Callaghan, Jackson, Xu, Colleen, Ahalt, Stanley C, Balhoff, Jim, Edwards, Stephen, Haaland, Perry, Knowles, Michael, Krishnamurthy, Ashok, Mandal, Meisha, Peden, David B, Pfaff, Emily, Schurman, Shepherd, Shrivastava, Shalki, Yi, Hong, Reilly, Jason, Kanwar, Richa, Cox, Steven, Vaidya, Gaurav, Wang, Max, Alkanaq, Ahmed, Costanzo, Maria, Koesterer, Ryan, Flannick, Jason, Burtt, Noel, Kluge, Alexandria, Rubin, Irit, Strasser, Michael Michi, Chung, Lawrence, Kang, Jimin, Mantilla, Michelle, Muller, Sandrine, Persaud, Bria, Wei, Qi, Baumgartner, Andrew, Dai, Cheng, and Duvvuri, Venkata
- Subjects
Pharmacology and Pharmaceutical Sciences, Biomedical and Clinical Sciences, Cardiovascular Medicine and Haematology, Biomedical Data Translator Consortium, Cardiorespiratory Medicine and Haematology, Oncology and Carcinogenesis, Other Medical and Health Sciences, General Clinical Medicine, Cardiovascular medicine and haematology, Pharmacology and pharmaceutical sciences
- Abstract
Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well-being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline-specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph-based "Translator" system capable of integrating existing biomedical data sets and "translating" those data into insights intended to augment human reasoning and accelerate translational science. Having demonstrated feasibility of the Translator system, the Translator program has since moved into development, and the Translator Consortium has made significant progress in the research, design, and implementation of an operational system. Herein, we describe the current system's architecture, performance, and quality of results. We apply Translator to several real-world use cases developed in collaboration with subject-matter experts. Finally, we discuss the scientific and technical features of Translator and compare those features to other state-of-the-art, biomedical graph-based question-answering systems.
- Published
- 2022
6. Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science
- Author
-
Unni, Deepak R, Moxon, Sierra AT, Bada, Michael, Brush, Matthew, Bruskiewich, Richard, Caufield, J Harry, Clemons, Paul A, Dancik, Vlado, Dumontier, Michel, Fecho, Karamarie, Glusman, Gustavo, Hadlock, Jennifer J, Harris, Nomi L, Joshi, Arpita, Putman, Tim, Qin, Guangrong, Ramsey, Stephen A, Shefchek, Kent A, Solbrig, Harold, Soman, Karthik, Thessen, Anne E, Haendel, Melissa A, Bizon, Chris, Mungall, Christopher J, Consortium, The Biomedical Data Translator, Acevedo, Liliana, Ahalt, Stanley C, Alden, John, Alkanaq, Ahmed, Amin, Nada, Avila, Ricardo, Balhoff, Jim, Baranzini, Sergio E, Baumgartner, Andrew, Baumgartner, William, Belhu, Basazin, Brandes, MacKenzie, Brandon, Namdi, Burtt, Noel, Byrd, William, Callaghan, Jackson, Cano, Marco Alvarado, Carrell, Steven, Celebi, Remzi, Champion, James, Chen, Zhehuan, Chen, Mei‐Jan, Chung, Lawrence, Cohen, Kevin, Conlin, Tom, Corkill, Dan, Costanzo, Maria, Cox, Steven, Crouse, Andrew, Crowder, Camerron, Crumbley, Mary E, Dai, Cheng, Dančík, Vlado, De Miranda Azevedo, Ricardo, Deutsch, Eric, Dougherty, Jennifer, Duby, Marc P, Duvvuri, Venkata, Edwards, Stephen, Emonet, Vincent, Fehrmann, Nathaniel, Flannick, Jason, Foksinska, Aleksandra M, Gardner, Vicki, Gatica, Edgar, Glen, Amy, Goel, Prateek, Gormley, Joseph, Greyber, Alon, Haaland, Perry, Hanspers, Kristina, He, Kaiwen, Henrickson, Jeff, Hinderer, Eugene W, Hoatlin, Maureen, Hoffman, Andrew, Huang, Sui, Huang, Conrad, Hubal, Robert, Huellas‐Bruskiewicz, Kenneth, Huls, Forest B, Hunter, Lawrence, Hyde, Greg, Issabekova, Tursynay, Jarrell, Matthew, Jenkins, Lindsay, Johs, Adam, Kang, Jimin, Kanwar, Richa, Kebede, Yaphet, Kim, Keum Joo, Kluge, Alexandria, Knowles, Michael, and Koesterer, Ryan
- Subjects
Pharmacology and Pharmaceutical Sciences, Biomedical and Clinical Sciences, Cardiovascular Medicine and Haematology, Networking and Information Technology R&D (NITRD), Data Science, 2.6 Resources and infrastructure (aetiology), Aetiology, Generic health relevance, Knowledge, Pattern Recognition, Automated, Translational Science, Biomedical, Biomedical Data Translator Consortium, Cardiorespiratory Medicine and Haematology, Oncology and Carcinogenesis, Other Medical and Health Sciences, General Clinical Medicine, Cardiovascular medicine and haematology, Pharmacology and pharmaceutical sciences
- Abstract
Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
- Published
- 2022
7. Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019
- Author
-
Abbas, Nacira, Alghamdi, Kholoud, Alinam, Mortaza, Alloatti, Francesca, Amaral, Glenda, d'Amato, Claudia, Asprino, Luigi, Beno, Martin, Bensmann, Felix, Biswas, Russa, Cai, Ling, Capshaw, Riley, Carriero, Valentina Anita, Celino, Irene, Dadoun, Amine, De Giorgis, Stefano, Delva, Harm, Domingue, John, Dumontier, Michel, Emonet, Vincent, van Erp, Marieke, Arias, Paola Espinoza, Fallatah, Omaima, Ferrada, Sebastián, Ocaña, Marc Gallofré, Georgiou, Michalis, Gesese, Genet Asefa, Gillis-Webber, Frances, Giovannetti, Francesca, Buey, Marìa Granados, Harrando, Ismail, Heibi, Ivan, Horta, Vitor, Huber, Laurine, Igne, Federico, Jaradeh, Mohamad Yaser, Keshan, Neha, Koleva, Aneta, Koteich, Bilal, Kurniawan, Kabul, Liu, Mengya, Ma, Chuangtao, Maas, Lientje, Mansfield, Martin, Mariani, Fabio, Marzi, Eleonora, Mesbah, Sepideh, Mistry, Maheshkumar, Tirado, Alba Catalina Morales, Nguyen, Anna, Nguyen, Viet Bach, Oelen, Allard, Pasqual, Valentina, Paulheim, Heiko, Polleres, Axel, Porena, Margherita, Portisch, Jan, Presutti, Valentina, Pustu-Iren, Kader, Mendez, Ariam Rivas, Roshankish, Soheil, Rudolph, Sebastian, Sack, Harald, Sakor, Ahmad, Salas, Jaime, Schleider, Thomas, Shi, Meilin, Spinaci, Gianmarco, Sun, Chang, Tietz, Tabea, Dhouib, Molka Tounsi, Umbrico, Alessandro, Berg, Wouter van den, and Xu, Weiqin
- Subjects
Computer Science - Artificial Intelligence
- Abstract
One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this further by asking if we can create a knowledge graph of "everything" ranging from common sense concepts to location based entities. This knowledge graph should be "open to the public" in a FAIR manner democratizing this mass amount of knowledge." Although linked open data (LOD) is one knowledge graph, it is the closest realisation (and probably the only one) to a public FAIR Knowledge Graph (KG) of everything. Surely, LOD provides a unique testbed for experimenting and evaluating research hypotheses on open and FAIR KG. One of the most neglected FAIR issues about KGs is their ongoing evolution and long term preservation. We want to investigate this problem, that is to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective to the problem of knowledge graph evolution substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definition for KG preservation and evolution.
- Published
- 2020
8. The FAIR extension: A web browser extension to evaluate Digital Object FAIRness
- Author
-
Hernández Serrano, Pedro V., Al-Saeedi, Ali, Emonet, Vincent, and Diblen, Faruk
- Subjects
FAIR metrics, FAIR assessment, FAIR Digital Objects
- Abstract
The scientific community’s efforts have increased regarding evaluating the FAIRness of research Digital Objects such as publications, datasets, or research software. However, this requires a steep learning curve for the average researcher when learning the FAIR evaluation frameworks, disengaging some of them in the process. This project aims to use technology to close this gap and make this process more accessible by bringing the FAIR evaluation to the researcher’s profiles. “The FAIR extension” is an open-source web browser extension that allows researchers to make FAIR metrics evaluations directly from scholarly aggregators such as PURE, Google Scholar, or ResearchGate without any additional knowledge. The browser extension follows the FAIR metrics group specification, building on top of the community-accepted FAIR evaluators’ APIs. 🕶 This resource is presented at the 1st International Conference on FAIR Digital Objects, which is linked to the following abstract: doi.org/10.3897/rio.8.e95006 🧩 Install it in your browser! Available in the Chrome store: "the FAIR extension" 🔍 For more information, please visit the FAIR extension website.
- Published
- 2022
9. The FAIR extension: A web browser extension to evaluate Digital Object FAIRness
- Author
-
Hernandez Serrano, Pedro, primary and Emonet, Vincent, additional
- Published
- 2022
10. Towards an extensible FAIRness assessment of FAIR Digital Objects
- Author
-
Emonet, Vincent, primary, Çelebi, Remzi, additional, Yang, Jinzhou, additional, and Dumontier, Michel, additional
- Published
- 2022
11. A comprehensive comparison of automated FAIRness Evaluation Tools
- Author
-
Sun, Chang, Emonet, Vincent, and Dumontier, Michel
- Abstract
The FAIR Guiding Principles (Findable, Accessible, Interoperable, and Reusable) have been widely endorsed by the scientific community, funding agencies, and policymakers. However, the FAIR principles leave ample room for different implementations, and several groups have worked towards manual, semi-automatic, and automatic approaches to evaluate the FAIRness of digital objects. This study compares and contrasts three automated FAIRness evaluation tools, namely F-UJI, the FAIR Evaluator, and FAIR Checker. We examine three aspects: 1) tool characteristics, 2) the evaluation metrics, and 3) metrics tests for three public datasets. We find significant differences in the evaluation results for tested resources, along with differences in the design, implementation, and documentation of the evaluation metrics and platforms. While automated tools do test a wide breadth of technical expectations of the FAIR principles, we put forward specific recommendations for their improved utility, transparency, and interpretability.
- Published
- 2022
12. Data2Services: enabling automated conversion of data to services
- Author
-
Emonet, Vincent, Zaveri, Amrapali, Malic, Alexander, Grigoriu, Andreea, and Dumontier, Michel
- Published
- 2018
13. Semantic micro-contributions with decentralized nanopublication services
- Author
-
Kuhn, Tobias, primary, Taelman, Ruben, additional, Emonet, Vincent, additional, Antonatos, Haris, additional, Soiland-Reyes, Stian, additional, and Dumontier, Michel, additional
- Published
- 2021
14. Transformation and integration of heterogeneous health data in a privacy-preserving distributed learning infrastructure
- Author
-
Sun, Chang, Emonet, Vincent, van Soest, Johan, Koster, Annemarie, Dekker, Andre, and Dumontier, Michel
- Abstract
Problem statement: A growing volume and variety of personal health data are being collected by different entities, such as healthcare providers, insurance companies, and wearable device manufacturers. Combining heterogeneous health data offers unprecedented opportunities to augment our understanding of human health and disease. However, a major challenge to research lies in the difficulty of accessing and analyzing health data that are dispersed in their format (e.g. CSV, XML), sources (e.g., medical records, laboratory data), representation (unstructured, structured), and governance (e.g., data collection and maintenance)[2]. Such considerations are crucial when we link and use personal health data across multiple legal entities with different data governance and privacy concerns.
- Published
- 2019
15. The FAIR extension: A web browser extension to evaluate Digital Object FAIRness.
- Author
-
Serrano, Pedro Hernandez and Emonet, Vincent
- Subjects
WEB browsers, COMPUTER software, DIGITAL technology, CHROMIUM, AWARENESS
- Abstract
The scientific community's efforts have increased regarding the application and assessment of the FAIR principles on Digital Objects (DO) such as publications, datasets, or research software. Consequently, openly available automated FAIR assessment services have been working on standardization, such as FAIR enough, the FAIR evaluator or FAIRsFAIR's F-UJI. Digital Competence Centers such as University Libraries have been paramount in this process by facilitating a range of activities, such as awareness campaigns, trainings, or systematic support. However, in practice, using the FAIR assessment tools is still an intricate process for the average researcher. It requires a steep learning curve since it involves performing a series of manual processes requiring specific knowledge when learning the frameworks, disengaging some researchers in the process. We aim to use technology to close this gap and make this process more accessible by bringing the FAIR assessment to the researcher's profiles. We will develop "The FAIR extension", an open-source, user-friendly web browser extension that allows researchers to make FAIR assessments directly at the web source. Web browser extensions have been an accessible digital tool for libraries supporting scholarship (De Sarkar 2015). A remarkable example is the lightweight version of reference managers deployed as a browser service (Ferguson 2019). Moreover, it has been demonstrated that they can be a vehicle for open access, such as Lean Library Browser Extension. The FAIR extension is a service that builds on top of the community-accepted FAIR evaluator APIs, i.e. it does not intend to create yet another FAIR assessment framework from scratch. The objective of the FAIR Digital Objects Framework (FDOF) is for objects published in a digital environment to comply with a set of requirements, such as identifiability, and the use of a rich metadata record (Santos 2021, Schultes and Wittenburg 2019).
The FAIR extension will connect via REST-like operations to individual FAIR metrics test endpoints, following Wilkinson et al. (2018) and Wilkinson et al. (2019), and display the FAIR metrics on the client side (Fig. 1). Ultimately, the user will get FAIR scores of articles, datasets and other DOs in real-time on a web source, such as a scholarly platform or DO repository, with the possibility of creating simple reports of the assessment. It is acknowledged that the development of web-based tools carries some constraints regarding platform version releases, e.g. Chromium Development Calendar. Nevertheless, we are optimistic about the potential use cases. For example: 1. A student who wants to make use of a DO (e.g. a software package) but doesn't know which to choose; the FAIR extension will indicate which one is more FAIR and aid the decision-making process. 2. A data steward recommending sources. 3. A researcher who wants to display all FAIR metrics of her DOs on a research profile. 4. A PI who wants to evaluate an aggregated metric for a project. These use cases can be the means of bringing the open source community and FAIR DO interest groups to work together. [ABSTRACT FROM AUTHOR]
- Published
- 2022
16. Harnessing the power of unified metadata in an ontology repository: The case of AgroPortal
- Author
-
Jonquet, Clement, Toulet, Anne, Dutta, Biswanath, Emonet, Vincent, Système Multi-agent, Interaction, Langage, Evolution (SMILE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Indian Statistical Institute [Bangalore] (ISI), ANR-10-LABX-0020,NUMEV,Digital and Hardware Solutions and Modeling for the Environement and Life Sciences(2010), ANR-11-BINF-0002,IBC,Institut de biologie Computationnelle(2011), European Project: 701771,H2020,H2020-MSCA-IF-2015,SIFRm(2016), and Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
AgroPortal, Land use, Agronomy, Statistical methods, U10 - Computer science, mathematics and statistics, Semantic description, [INFO.INFO-WB] Computer Science [cs]/Web, Ontology relation, Ontology repository, Ontology metadata vocabulary, application ontology, C30 - Documentation and information, Ontology selection, BioPortal, ontology, vocabulary, Mathematical models
- Abstract
Like any resource, ontologies, thesauri, vocabularies and terminologies need to be described with relevant metadata to facilitate their identification, selection and reuse. For ontologies to be FAIR, there is a need for metadata authoring guidelines and for harmonization of existing metadata vocabularies: taken independently, none of them can completely describe an ontology. Ontology libraries and repositories also have to play an important role. Indeed, some metadata properties are intrinsic to the ontology (name, license, description); other information, such as community feedback or relations to other ontologies, is typically information that an ontology library shall capture, populate and consolidate to facilitate the processes of identifying and selecting the right ontology(ies) to use. We have studied ontology metadata practices by: (i) analyzing metadata annotations of 805 ontologies; (ii) reviewing the most standard and relevant vocabularies (23 in total) currently available to describe metadata for ontologies (such as Dublin Core, Ontology Metadata Vocabulary, VoID, etc.); (iii) comparing different metadata implementations in multiple ontology libraries or repositories. We have then built a new metadata model for our AgroPortal vocabulary and ontology repository, a platform dedicated to agronomy based on the NCBO BioPortal technology. AgroPortal now recognizes 346 properties from existing metadata vocabularies that could be used to describe different aspects of ontologies: intrinsic descriptions, people, date, relations, content, metrics, community, administration, and access. We use them to populate an internal model of 127 properties implemented in the portal and harmonized for all the ontologies. We, and AgroPortal's users, have spent a significant amount of time editing and curating the metadata of the ontologies to offer better synthesized and harmonized information and enable new ontology identification features.
Our goal was also to facilitate the comprehension of the agronomical ontology landscape by displaying diagrams and charts about all the ontologies on the portal. We have evaluated our work with a user appreciation survey which confirms the new features are indeed relevant and helpful to ease the processes of identification and selection of ontologies. This paper presents how to harness the potential of a complete and unified metadata model with dedicated features in an ontology repository, however the new AgroPortal's model is not a new vocabulary as it relies on pre-existing ones. A generalization of this work is studied in a community driven standardization effort in the context of the RDA Vocabulary and Semantic Services Interest Group.
- Published
- 2018
17. ICD-10 coding of death certificates with the NCBO and SIFR Annotators at CLEF eHealth 2017
- Author
-
Tchechmedjiev, Andon, Abdaoui, Amine, Emonet, Vincent, Jonquet, Clement, Système Multi-agent, Interaction, Langage, Evolution (SMILE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), ADVanced Analytics for data SciencE (ADVANSE), Stanford Center for BioMedical Informatics Research (BMIR), Stanford University, ANR-12-JS02-0010,SIFR,Indexation sémantique de ressources biomédicales francophones(2012), ANR-15-CE23-0028,PractiKPharma,Confrontation entre connaissances de l'état de l'art et connaissances extraites de dossiers patients en pharmacogénomique(2015), European Project: 701771,H2020,H2020-MSCA-IF-2015,SIFRm(2016), Jonquet, Clement, JCJC - SIMI 2 - Science informatique et applications - Indexation sémantique de ressources biomédicales francophones - - SIFR2012 - ANR-12-JS02-0010 - JC - VALID, Interactions humain-machine, objets connectés, contenus numériques, données massives et connaissance - Confrontation entre connaissances de l'état de l'art et connaissances extraites de dossiers patients en pharmacogénomique - - PractiKPharma2015 - ANR-15-CE23-0028 - AAPG2015 - VALID, Semantic Indexing of French Biomedical Data Resources - mobility - SIFRm - - H20202016-09-01 - 2019-08-31 - 701771 - VALID, and Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)
- Subjects
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing ,Semantic annotation ,NCBO annotator ,[INFO.INFO-WB]Computer Science [cs]/Web ,SIFR annotator ,ICD-10 coding ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] ,Biomedical ontologies - Abstract
The SIFR BioPortal is an open platform to host French biomedical ontologies and terminologies based on the technology developed by the US National Center for Biomedical Ontology (NCBO). The portal facilitates the use and fostering of terminologies and ontologies by offering a set of services including semantic annotation. The SIFR Annotator (http://bioportal.lirmm.fr/annotator) is a publicly accessible, easily usable ontology-based annotation tool to process French text data and facilitate semantic indexing. The web service relies on the ontology content (preferred labels and synonyms) as well as on the semantics of the ontologies (is-a hierarchies) and their mappings. The SIFR BioPortal also offers the possibility of querying the original NCBO Annotator for English text via a dedicated proxy that extends the original functionality. In this paper, we present a preliminary performance evaluation of the generic annotation web service (i.e., not specifically customized) for coding death certificates, i.e., annotating them with ICD-10 codes. This evaluation is done against the CépiDC/CDC CLEF eHealth 2017 task 1 manually annotated corpus. For this purpose, we have built custom SKOS vocabularies from the CépiDC/CDC dictionaries as well as training and development corpora, for all three tasks, using a most-frequent-code heuristic to assign ambiguous labels. We then submitted the vocabularies to the NCBO and SIFR BioPortal and used the annotation services on task 1 datasets. We obtained, for our best runs on each corpus, the following results: English raw corpus (69.08% P, 51.37% R, 58.92% F1); French raw corpus (54.11% P, 48.00% R, 50.87% F1); French aligned corpus (50.63% P, 52.97% R, 51.77% F1).
- Published
- 2017
18. SIFR annotator: ontology-based semantic annotation of French biomedical text and clinical notes
- Author
-
Tchechmedjiev, Andon, primary, Abdaoui, Amine, additional, Emonet, Vincent, additional, Zevio, Stella, additional, and Jonquet, Clement, additional
- Published
- 2018
- Full Text
- View/download PDF
19. Enhanced functionalities for annotating and indexing clinical text with the NCBO Annotator+
- Author
-
Tchechmedjiev, Andon, primary, Abdaoui, Amine, additional, Emonet, Vincent, additional, Melzi, Soumia, additional, Jonnagaddala, Jitendra, additional, and Jonquet, Clement, additional
- Published
- 2018
- Full Text
- View/download PDF
20. AgroPortal: an ontology repository for agronomy
- Author
-
Jonquet, Clément, Toulet, Anne, Emonet, Vincent, and Larmande, Pierre
- Abstract
Many vocabularies and ontologies are produced to represent and annotate agronomic data. Therefore, there is a need for a common platform to identify, host and use them in agro-informatics applications. By reusing the NCBO BioPortal technology, we have designed AgroPortal, an ontology repository for the agronomy domain. The AgroPortal project aims at reusing the scientific outcomes and experience of the biomedical domain in the context of plant, agronomy, food, and biodiversity. We offer an ontology portal which features ontology hosting, search, versioning, visualization, comment, and recommendation, and which enables semantic annotation as well as storing and exploiting ontology alignments, all within a fully Semantic Web compliant infrastructure. AgroPortal pays specific attention to the requirements of the agronomic community in terms of ontology formats (e.g., SKOS, trait dictionaries) and supported features. In this demonstration, we will present our platform, currently open and accessible at http://agroportal.lirmm.fr.
- Published
- 2017
21. Modèle de métadonnées dans un portail d'ontologies
- Author
-
Toulet, Anne, Emonet, Vincent, Jonquet, Clement, Système Multi-agent, Interaction, Langage, Evolution (SMILE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), ANR-10-LABX-0020,NUMEV,Digital and Hardware Solutions and Modeling for the Environement and Life Sciences(2010), ANR-11-BINF-0002,IBC,Institut de biologie Computationnelle(2011), and ANR-12-JS02-0010,SIFR,Indexation sémantique de ressources biomédicales francophones(2012)
- Subjects
AgroPortal ,semantic description ,BioPortal ,portail d'ontologies ,[INFO]Computer Science [cs] ,ontology ,standard W3C ,vocabulaire de métadonnées ,ontology repository ,W3C standards ,web sémantique ,metadata vocabulary ,ontologie - Abstract
Scientific communities are using an increasing number of ontologies. Repositories make them available, like the NCBO BioPortal, which currently hosts more than 500 biomedical ontologies. Faced with this abundance of resources, the question is how to find the ontology one needs. One solution is to describe each ontology with appropriate metadata. However, none of the existing metadata vocabularies can completely meet this need if taken independently. We have reviewed a large number of vocabularies, such as Dublin Core, OMV, DCAT, or VOID, as well as the properties implemented by common ontology repositories. We then consolidated those properties into a simplified model of 124 properties. We present a few examples of the use of these properties within AgroPortal, an ontology repository for agronomy, and explain how the portal handles these properties to facilitate ontology description and selection.
- Published
- 2016
22. AgroPortal: an open repository of ontologies and vocabularies for agriculture and nutrition data
- Author
-
Jonquet, Clement, Toulet, Anne, Arnaud, Elizabeth, Aubin, Sophie, Yeumo, Esther Dzalé, Emonet, Vincent, Pesce, Valeria, Larmande, Pierre, Système Multi-agent, Interaction, Langage, Evolution (SMILE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Bioversity International, Consultative Group on International Agricultural Research [CGIAR], DIST Délégation Information Scientifique et Technique (DV-IST), Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Food and Agricultural Organization of the United Nations (FAO), United Nations Organization, Diversité, adaptation, développement des plantes (UMR DIADE), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche pour le Développement (IRD [France-Sud]), Scientific Data Management (ZENITH), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Institut de Biologie Computationnelle (IBC), Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Ben Schaap, ANR-12-JS02-0010,SIFR,Indexation sémantique de ressources biomédicales francophones(2012), ANR-11-BINF-0002,IBC,Institut de Biologie Computationnelle de Montpellier(2011), ANR-10-LABX-0020/10-LABX-0020,NUMEV,Digital and Hardware 
Solutions and Modeling for the Environement and Life Sciences(2010), Bioversity International [Montpellier], Bioversity International [Rome], Consultative Group on International Agricultural Research [CGIAR] (CGIAR)-Consultative Group on International Agricultural Research [CGIAR] (CGIAR), Institut National de la Recherche Agronomique (INRA), Food and Agriculture Organization of the United Nations [Rome, Italie] (FAO), ANR-11-BINF-0002,IBC,Institut de biologie Computationnelle(2011), ANR-10-LABX-0020,NUMEV,Digital and Hardware Solutions and Modeling for the Environement and Life Sciences(2010), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Institut de Recherche pour le Développement (IRD [France-Sud])-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Inria Sophia Antipolis - Méditerranée (CRISAM), and Université de Montpellier (UM)-Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
AgroPortal ,Ontologies ,[INFO]Computer Science [cs] ,Agronomy - Abstract
Similarly to what happens in biomedicine, communities engaged in agronomic research need to access specific sets of ontologies for data annotation and integration. For instance, it has been established that the scientific challenges in plant breeding have switched from genetics to phenotyping and that standard trait/phenotype vocabularies are necessary to facilitate breeders' data integration and comparison. In parallel with very specific crop dictionaries, important organizations have produced large reference vocabularies such as AGROVOC (Food and Agriculture Organization), the NAL Thesaurus (National Agricultural Library) or the CAB Thesaurus (Centre for Agricultural Bioscience International). The more ontologies are produced in the domain, the more important it becomes to create, store and retrieve alignments between those ontologies. In fact, there is a need for a one-stop shop for agronomical, environmental and food-related ontologies that makes it possible to identify and select an ontology for a specific task and that offers generic services to exploit them in search, annotation or other scientific data management processes.
- Published
- 2016
23. Data and text mining: Enhanced functionalities for annotating and indexing clinical text with the NCBO Annotator+.
- Author
-
Tchechmedjiev, Andon, Abdaoui, Amine, Emonet, Vincent, Melzi, Soumia, Jonnagaddala, Jitendra, and Jonquet, Clement
- Subjects
DATA mining ,ONTOLOGIES (Information retrieval) ,BIOMEDICAL engineering ,WEB services ,NATURAL language processing - Abstract
Summary: Secondary use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology (NCBO) Annotator is a frequently used annotation service, originally designed for biomedical data, but not very suitable for clinical text annotation. In order to add new functionalities to the NCBO Annotator without hosting or modifying the original Web service, we have designed a proxy architecture that enables seamless extensions by pre-processing of the input text and parameters, and post-processing of the annotations. We have then implemented enhanced functionalities for annotating and indexing free text such as: scoring, detection of context (negation, experiencer, temporality), new output formats and coarse-grained concept recognition (with UMLS Semantic Groups). In this paper, we present the NCBO Annotator+, a Web service which incorporates these new functionalities, as well as a small set of evaluation results for concept recognition and clinical context detection on two standard evaluation tasks (CLEF eHealth 2017, SemEval 2014). [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
24. Enhanced functionalities for annotating and indexing clinical text with the NCBO Annotator+
- Author
-
Tchechmedjiev, Andon, Abdaoui, Amine, Emonet, Vincent, Melzi, Soumia, Jonnagaddala, Jitendra, Jonquet, Clement, Système Multi-agent, Interaction, Langage, Evolution (SMILE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), ADVanced Analytics for data SciencE (ADVANSE), UNSW Faculty of Medicine [Sydney], University of New South Wales [Sydney] (UNSW), Stanford Center for BioMedical Informatics Research (BMIR), Stanford University, ANR-12-JS02-0010,SIFR,Indexation sémantique de ressources biomédicales francophones(2012), ANR-15-CE23-0028,PractiKPharma,Confrontation entre connaissances de l'état de l'art et connaissances extraites de dossiers patients en pharmacogénomique(2015), European Project: 701771,H2020,H2020-MSCA-IF-2015,SIFRm(2016), and Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
NCBO Annotator ,[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing ,Semantic annotation ,Text mining ,[INFO.INFO-WB]Computer Science [cs]/Web ,Ontologies ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] ,Biomedical ontologies ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] - Abstract
Summary: Secondary use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology (NCBO) Annotator is a frequently used annotation service, originally designed for biomedical data, but not very suitable for clinical text annotation. In order to add new functionalities to the NCBO Annotator without hosting or modifying the original Web service, we have designed a proxy architecture that enables seamless extensions by pre-processing of the input text and parameters, and post-processing of the annotations. We have then implemented enhanced functionalities for annotating and indexing free text such as: scoring, detection of context (negation, experiencer, temporality), new output formats, and coarse-grained concept recognition (with UMLS Semantic Groups). In this paper, we present the NCBO Annotator+, a Web service which incorporates these new functionalities, as well as a small set of evaluation results for concept recognition and clinical context detection on two standard evaluation tasks (CLEF eHealth 2017, SemEval 2014). Availability and Implementation: The Annotator+ has been successfully integrated into the SIFR BioPortal platform (an implementation of NCBO BioPortal for French biomedical terminologies and ontologies) to annotate English text. A Web user interface is available for testing and ontology selection (http://bioportal.lirmm.fr/ncbo_annotatorplus); however, the Annotator+ is meant to be used through the Web service application programming interface (http://services.bioportal.lirmm.fr/ncbo_annotatorplus). The code is openly available, and we also provide a Docker packaging to enable easy local deployment to process sensitive (e.g., clinical) data in-house (https://github.com/sifrproject). Contact: andon.tchechmedjiev@lirmm.fr and jonquet@lirmm.fr. Supplementary information: Technical details and documentation available online.
- Full Text
- View/download PDF
25. AgroPortal: an ontology repository for agronomy
- Author
-
Jonquet, Clément, Toulet, Anne, Emonet, Vincent, and Larmande, Pierre
- Abstract
Many vocabularies and ontologies are produced to represent and annotate agronomic data. Therefore, there is a need for a common platform to identify, host and use them in agro-informatics applications. By reusing the NCBO BioPortal technology, we have designed AgroPortal, an ontology repository for the agronomy domain. The AgroPortal project aims at reusing the scientific outcomes and experience of the biomedical domain in the context of plant, agronomy, food, and biodiversity. We offer an ontology portal which features ontology hosting, search, versioning, visualization, comment, and recommendation, and which enables semantic annotation as well as storing and exploiting ontology alignments, all within a fully Semantic Web compliant infrastructure. AgroPortal pays specific attention to the requirements of the agronomic community in terms of ontology formats (e.g., SKOS, trait dictionaries) and supported features. In this demonstration, we will present our platform, currently open and accessible at http://agroportal.lirmm.fr.