Dug: A Semantic Search Engine Leveraging Peer-Reviewed Knowledge to Span Biomedical Data Repositories

Authors :
Alexander M. Waldrop
John B. Cheadle
Kira Bradford
Alexander Preiss
Robert Chew
Jonathan R. Holt
Nathan Braswell
Matt Watson
Andrew Crerar
Chris M. Ball
Yaphet Kebede
Carl Schreep
PJ Linebaugh
Hannah Hiles
Rebecca Boyles
Chris Bizon
Ashok Krishnamurthy
Steve Cox
Publication Year :
2021
Publisher :
Cold Spring Harbor Laboratory, 2021.

Abstract

Motivation: As the number of public data resources continues to proliferate, identifying relevant datasets across heterogeneous repositories is becoming critical to answering scientific questions. To help researchers navigate this data landscape, we developed Dug: a semantic search tool for biomedical datasets utilizing evidence-based relationships from curated knowledge graphs to find relevant datasets and explain why those results are returned.

Results: Developed through the National Heart, Lung, and Blood Institute’s (NHLBI) BioData Catalyst ecosystem, Dug has indexed more than 15,911 study variables from public datasets. On a manually curated search dataset, Dug’s total recall (total relevant results/total results) of 0.79 outperformed default Elasticsearch’s total recall of 0.76. When using synonyms or related concepts as search queries, Dug (0.36) far outperformed Elasticsearch (0.14) in terms of total recall with no significant loss in the precision of its top results.

Availability and Implementation: Dug is freely available at https://github.com/helxplatform/dug. An example Dug deployment is also available for use at https://search.biodatacatalyst.renci.org/.

Contact: awaldrop@rti.org or scox@renci.org

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........7f0541135f72e4f61ad4b26227ddecd5
Full Text :
https://doi.org/10.1101/2021.07.07.451461