SAMBLASTER: fast duplicate marking and structural variant read extraction.
- Source :
- Bioinformatics. Sep 2014, Vol. 30 Issue 17, p2503-2505. 3p.
- Publication Year :
- 2014
Abstract
- Motivation: Illumina DNA sequencing is now the predominant source of raw genomic data, and data volumes are growing rapidly. Bioinformatic analysis pipelines are having trouble keeping pace. A common bottleneck in such pipelines is the requirement to read, write, sort and compress large BAM files multiple times. Results: We present SAMBLASTER, a tool that reduces the number of times such costly operations are performed. SAMBLASTER is designed to mark duplicates in read-sorted SAM files as a piped post-pass on DNA aligner output before it is compressed to BAM. In addition, it can simultaneously output into separate files the discordant read-pairs and/or split-read mappings used for structural variant calling. As an alignment post-pass, its own runtime overhead is negligible, while dramatically reducing overall pipeline complexity and runtime. As a stand-alone duplicate marking tool, it performs significantly better than PICARD or SAMBAMBA in terms of both speed and memory usage, while achieving nearly identical results. Availability and implementation: SAMBLASTER is open-source C++ code and freely available for download from https://github.com/GregoryFaust/samblaster. Contact: imh4y@virginia.edu [ABSTRACT FROM AUTHOR]
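The idea of marking duplicates in a single streaming pass over read-grouped SAM text can be illustrated with a minimal Python sketch. This is not SAMBLASTER's actual algorithm or code: the function name `mark_duplicates` and the signature scheme are assumptions made for illustration, and the sketch uses the reported `POS` field directly, where a production tool would need to account for clipping and other details. It assumes all alignments for one read ID appear on consecutive lines, as they do in typical aligner output.

```python
import itertools

# Standard SAM FLAG bits used by this sketch.
UNMAPPED_FLAG = 0x4
REVERSE_FLAG = 0x10
SECONDARY_FLAG = 0x100
DUP_FLAG = 0x400
SUPPLEMENTARY_FLAG = 0x800
SKIP_MASK = UNMAPPED_FLAG | SECONDARY_FLAG | SUPPLEMENTARY_FLAG


def _group_key(line):
    """Return None for header lines, otherwise the read ID (QNAME)."""
    return None if line.startswith("@") else line.split("\t", 1)[0]


def mark_duplicates(sam_lines):
    """Yield SAM lines, setting the duplicate flag on repeated read groups.

    Hypothetical helper: a read group is treated as a duplicate when the
    set of (reference, position, strand) triples of its primary mapped
    alignments has already been seen earlier in the stream.
    """
    seen = set()
    stripped = (line.rstrip("\n") for line in sam_lines)
    for qname, group in itertools.groupby(stripped, key=_group_key):
        group = list(group)
        if qname is None:
            # Header lines pass through unchanged.
            yield from group
            continue
        fields = [line.split("\t") for line in group]
        signature = tuple(sorted(
            (f[2], int(f[3]), bool(int(f[1]) & REVERSE_FLAG))
            for f in fields
            if not int(f[1]) & SKIP_MASK
        ))
        # Groups with no primary mapped alignment are never marked.
        is_dup = bool(signature) and signature in seen
        seen.add(signature)
        for f in fields:
            if is_dup:
                f[1] = str(int(f[1]) | DUP_FLAG)
            yield "\t".join(f)
```

Because the function consumes and produces an iterator of lines, it could sit between an aligner and a BAM compressor by reading `sys.stdin` and writing to `sys.stdout`, which mirrors the piped post-pass arrangement the abstract describes. Only one signature per read group is held in memory, so no sorting of the input is required.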
Details
- Language :
- English
- ISSN :
- 13674803
- Volume :
- 30
- Issue :
- 17
- Database :
- Academic Search Index
- Journal :
- Bioinformatics
- Publication Type :
- Academic Journal
- Accession number :
- 97825443
- Full Text :
- https://doi.org/10.1093/bioinformatics/btu314