1. FastGT: an alignment-free method for calling common SNVs directly from raw sequencing reads.
- Author
- Pajuste FD, Kaplinski L, Möls M, Puurand T, Lepamets M, and Remm M
- Subjects
- Bayes Theorem; Benchmarking; Genotype; High-Throughput Nucleotide Sequencing; Humans; Reproducibility of Results; Sequence Analysis, DNA/statistics & numerical data; Algorithms; Genome, Human; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Software
- Abstract
We have developed a computational method that counts the frequencies of unique k-mers in FASTQ-formatted genome data and uses this information to infer the genotypes of known variants. FastGT can detect the variants in a 30x genome in less than 1 hour using ordinary low-cost server hardware. The overall concordance with the genotypes of two Illumina "Platinum" genomes is 99.96%, and the concordance with the genotypes of the Illumina HumanOmniExpress is 99.82%. Our method provides a k-mer database that can be used for the simultaneous genotyping of approximately 30 million single nucleotide variants (SNVs), including >23,000 SNVs from the Y chromosome. The source code of the FastGT software is available on GitHub (https://github.com/bioinfo-ut/GenomeTester4/).
- Published
- 2017
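
The idea summarized in the abstract, counting allele-specific k-mers in raw reads and inferring a genotype from the counts, can be illustrated with a toy sketch. This is not the FastGT implementation: the function names (`count_kmers`, `call_genotype`), the k-mer length of 5, the 1% error rate, and the simple binomial likelihood with equal genotype priors are all assumptions chosen for illustration, and the abstract does not describe the statistical model FastGT actually uses.

```python
from collections import Counter
from math import comb

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_kmers(reads, k, targets):
    """Count how often each target k-mer occurs in the reads, on either strand."""
    # Map each target and its reverse complement back to the target itself.
    lookup = {}
    for kmer in targets:
        lookup[kmer] = kmer
        lookup[revcomp(kmer)] = kmer
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            hit = lookup.get(read[i:i + k])
            if hit is not None:
                counts[hit] += 1
    return counts


def call_genotype(ref_count, alt_count, error=0.01):
    """Pick the most likely diploid genotype from two allele k-mer counts.

    Assumption: a binomial model with equal priors, standing in for
    whatever model the real tool applies.
    """
    n = ref_count + alt_count
    if n == 0:
        return None  # no informative k-mers observed
    # Expected fraction of alt k-mers under each genotype (hypothetical values).
    alt_fraction = {"0/0": error, "0/1": 0.5, "1/1": 1.0 - error}

    def likelihood(p):
        return comb(n, alt_count) * p ** alt_count * (1.0 - p) ** ref_count

    return max(alt_fraction, key=lambda g: likelihood(alt_fraction[g]))


# Toy data: one SNV with a reference k-mer and an alternate k-mer (k = 5).
REF_KMER, ALT_KMER = "AAGCA", "AATCA"
reads = ["GGAAGCAGG", "GGAATCAGG", "CCTGCTTCC"]  # third read is reverse strand
counts = count_kmers(reads, 5, [REF_KMER, ALT_KMER])
genotype = call_genotype(counts[REF_KMER], counts[ALT_KMER])
```

Because only k-mers from a predefined list are looked up, no read alignment is needed, which is the property the title refers to as "alignment-free". A realistic version would read FASTQ records from disk and use much longer k-mers that are unique in the genome.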