Reproducible acquisition, management and meta-analysis of nucleotide sequence (meta)data using q2-fondue
- Source :
- Bioinformatics, 38 (22)
- Publication Year :
- 2022
- Publisher :
- Oxford University Press (OUP), 2022.
Abstract
- Motivation: The volume of public nucleotide sequence data has blossomed over the past two decades and is ripe for re- and meta-analyses to enable novel discoveries. However, reproducible re-use and management of sequence datasets and associated metadata remain critical challenges. We created the open source Python package q2-fondue to enable user-friendly acquisition, re-use and management of public sequence (meta)data while adhering to open data principles.
- Results: q2-fondue allows fully provenance-tracked programmatic access to and management of data from the NCBI Sequence Read Archive (SRA). Unlike other packages allowing download of sequence data from the SRA, q2-fondue enables full data provenance tracking from data download to final visualization, integrates with the QIIME 2 ecosystem, prevents data loss upon space exhaustion and allows download of (meta)data given a publication library. To highlight its manifold capabilities, we present executable demonstrations using publicly available amplicon, whole genome and metagenome datasets.
- Availability and implementation: q2-fondue is available as an open-source BSD-3-licensed Python package at https://github.com/bokulich-lab/q2-fondue. Usage tutorials are available in the same repository. All Jupyter notebooks used in this article are available under https://github.com/bokulich-lab/q2-fondue-examples.
- Supplementary information: Supplementary data are available at Bioinformatics online.
Details
- ISSN :
- 1367-4811, 1367-4803, and 1460-2059
- Volume :
- 38
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....bff2941b0aa844663c9cdc9323425e39