KvarQ: targeted and direct variant calling from fastq reads of bacterial genomes
- Source :
- BMC Genomics
- Publication Year :
- 2014
Abstract
- Background: High-throughput DNA sequencing produces vast amounts of data, with millions of short reads that usually have to be mapped to a reference genome or newly assembled. Both reference-based mapping and de novo assembly are computationally intensive, generate large intermediary data files, and thus require bioinformatics skills that are often lacking in the laboratories producing the data. Moreover, many research and practical applications in microbiology require only a small fraction of the whole genome data.
Results: We developed KvarQ, a new tool that directly scans fastq files of bacterial genome sequences for known variants, such as single nucleotide polymorphisms (SNPs), bypassing the need for mapping all sequencing reads to a reference genome and for de novo assembly. Instead, KvarQ loads “testsuites” that define specific SNPs or short regions of interest in a reference genome, and directly synthesizes the relevant results based on the occurrence of these markers in the fastq files. KvarQ has a versatile command line interface and a graphical user interface. KvarQ currently ships with two “testsuites” for Mycobacterium tuberculosis, but new “testsuites” for other organisms can easily be created and distributed. In this article, we demonstrate how KvarQ can be used to successfully detect all main drug resistance mutations and phylogenetic markers in 880 bacterial whole genome sequences. The average scanning time per genome sequence was two minutes. The variant calls of a subset of these genomes were validated with a standard bioinformatics pipeline and revealed >99% congruency.
Conclusion: KvarQ is a user-friendly tool that directly extracts relevant information from fastq files. This enables researchers and laboratory technicians with limited bioinformatics expertise to scan and analyze raw sequencing data in a matter of minutes. KvarQ is open-source, and pre-compiled packages with a graphical user interface are available at http://www.swisstph.ch/kvarq.
Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-881) contains supplementary material, which is available to authorized users.
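The idea of scanning reads directly for known markers, as the abstract describes, can be illustrated with a minimal Python sketch. This is not KvarQ's actual algorithm or API: the flanking sequences, the function names (`read_fastq`, `count_bases`, `call_snp`), the coverage threshold, and the example reads are all hypothetical, and the sketch ignores base quality scores and sequencing errors in the flanks.

```python
from collections import Counter
from io import StringIO

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fastq(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality string (ignored in this sketch)
        yield seq


def count_bases(reads, left, right):
    """Count the base found between two non-empty flanks, on either strand."""
    counts = Counter()
    for read in reads:
        for strand in (read, reverse_complement(read)):
            start = strand.find(left)
            while start != -1:
                pos = start + len(left)
                if strand[pos + 1 : pos + 1 + len(right)] == right:
                    counts[strand[pos]] += 1
                start = strand.find(left, start + 1)
    return counts


def call_snp(counts, ref, alt, min_coverage=3):
    """Report the most frequent base at the marker if coverage is sufficient."""
    if sum(counts.values()) < min_coverage:
        return "undetermined"
    base, _ = counts.most_common(1)[0]
    if base == alt:
        return "mutant"
    if base == ref:
        return "wildtype"
    return "undetermined"


def to_fastq(seqs):
    """Build FASTQ text with placeholder qualities for the demo reads."""
    return "".join(
        f"@read{i}\n{s}\n+\n{'I' * len(s)}\n" for i, s in enumerate(seqs, 1)
    )


# Hypothetical marker: reference base C, variant base T, with made-up flanks.
LEFT, RIGHT = "GACCTGAT", "CTTGAGCA"
demo_reads = [
    "AA" + LEFT + "T" + RIGHT + "GG",                     # variant, forward
    reverse_complement("C" + LEFT + "T" + RIGHT + "T"),   # variant, reverse
    "TT" + LEFT + "C" + RIGHT + "A",                      # reference base
    "ACGTACGTACGTACGTACGT",                               # unrelated read
]

counts = count_bases(read_fastq(StringIO(to_fastq(demo_reads))), LEFT, RIGHT)
print(dict(counts), call_snp(counts, ref="C", alt="T"))
```

Because only reads containing a marker's flanks contribute to the result, no genome-wide alignment or assembly is needed, which is the property the abstract emphasizes.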
- Subjects :
- FASTQ format
Sequence assembly
Hybrid genome assembly
Bacterial genome size
Computational biology
Biology
Genome
Polymorphism, Single Nucleotide
DNA sequencing
03 medical and health sciences
In-silico SNP-typing
User-Computer Interface
Drug Resistance, Bacterial
Genetics
FastQ
Phylogeny
030304 developmental biology
Whole genome sequencing
0303 health sciences
Internet
Bacteria
030306 microbiology
Single nucleotide polymorphisms
Mycobacterium tuberculosis
3. Good health
Software
Algorithms
Genome, Bacterial
Reference genome
Biotechnology
Details
- ISSN :
- 1471-2164
- Volume :
- 15
- Database :
- OpenAIRE
- Journal :
- BMC Genomics
- Accession number :
- edsair.doi.dedup.....0f6ba7eb833499abd0453628567e402f
- Full Text :
- https://doi.org/10.1186/1471-2164-15-881