Streamlining remote nanopore data access with slow5curl.
- Source :
- GigaScience [Gigascience] 2024 Jan 02; Vol. 13.
- Publication Year :
- 2024
Abstract
- Background: As adoption of nanopore sequencing technology continues to advance, the need to maintain large volumes of raw current signal data for reanalysis with updated algorithms is a growing challenge. Here we introduce slow5curl, a software package designed to streamline nanopore data sharing, accessibility, and reanalysis.
  Results: Slow5curl allows a user to fetch a specified read or group of reads from a raw nanopore dataset stored on a remote server, such as a public data repository, without downloading the entire file. Slow5curl uses an index to quickly fetch specific reads from a large dataset in SLOW5/BLOW5 format and highly parallelized data access requests to maximize download speeds. Using all public nanopore data from the Human Pangenome Reference Consortium (>22 TB), we demonstrate how slow5curl can be used to quickly fetch and reanalyze raw signal reads corresponding to a set of target genes from each individual in a large cohort dataset (n = 91), minimizing the time, egress costs, and local storage requirements for their reanalysis.
  Conclusions: We provide slow5curl as a free, open-source package that will reduce frictions in data sharing for the nanopore community: https://github.com/BonsonW/slow5curl.
  (© The Author(s) 2024. Published by Oxford University Press GigaScience.)
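The index-plus-parallel-requests idea in the Results can be illustrated with a minimal Python sketch. It is not slow5curl's implementation and does not use the SLOW5/BLOW5 index format: the `index` dict (read ID mapped to a byte offset and length), the function names, and the injectable `fetch` parameter are all assumptions made for illustration. The sketch relies only on standard HTTP Range requests, which the remote server would need to support.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def range_header(offset, length):
    """Build an HTTP Range header value for `length` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive at both ends, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"


def http_fetch(url, offset, length):
    """Fetch one byte range from a remote file instead of downloading all of it."""
    req = Request(url, headers={"Range": range_header(offset, length)})
    with urlopen(req) as resp:
        return resp.read()


def fetch_reads(url, index, read_ids, fetch=http_fetch, workers=8):
    """Fetch the records for `read_ids` in parallel and return {read_id: bytes}.

    `index` is a hypothetical dict mapping read ID -> (offset, length).
    It stands in for a real index and is NOT the SLOW5/BLOW5 index format.
    """
    def fetch_one(read_id):
        offset, length = index[read_id]
        return read_id, fetch(url, offset, length)

    # One request per read, issued concurrently from a thread pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(fetch_one, read_ids))
```

Passing a local stand-in for `fetch` (for example, a function that slices an in-memory `bytes` object) lets the lookup logic be exercised without any network access; only the requested ranges are ever read.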
- Subjects :
- Humans
Algorithms
Information Dissemination
Records
Nanopores
Nanopore Sequencing
Details
- Language :
- English
- ISSN :
- 2047-217X
- Volume :
- 13
- Database :
- MEDLINE
- Journal :
- GigaScience
- Publication Type :
- Academic Journal
- Accession number :
- 38608279
- Full Text :
- https://doi.org/10.1093/gigascience/giae016