1. HighSSR: high-throughput SSR characterization and locus development from next-gen sequencing data
- Authors
- Peter Houde, Alexander Churbanov, Rachael Ryan, Donovan Bailey, Nabeeh A. Hasan, Haofeng Chen, and Brook Milligan
- Subjects
- Genetic Markers; Statistics and Probability; Sequencing data; Locus (genetics); Computational biology; Biology; Biochemistry; Genome; Species Specificity; Animals; Genomic library; Molecular Biology; Genotyping; Gene Library; Genetics; Polymorphism, Genetic; Fungi; High-Throughput Nucleotide Sequencing; Plants; Amplicon; Original Papers; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Genetic marker; Multigene Family; Microsatellite; Butterflies; Algorithms; Microsatellite Repeats
- Abstract
Motivation: Microsatellites are among the most useful genetic markers in population biology. High-throughput sequencing of microsatellite-enriched libraries dramatically expedites the traditional process of screening recombinant libraries for microsatellite markers. However, sorting through millions of reads to distill high-quality polymorphic markers requires special algorithms tailored to tolerate sequencing errors in locus reconstruction, distinguish paralogous loci, rarefy raw reads originating from the same amplicon, and sort out various artificial fragments resulting from recombination or concatenation of auxiliary adapters. Existing programs warrant improvement.
Results: We describe a microsatellite prediction framework named HighSSR for microsatellite genotyping based on high-throughput sequencing. We demonstrate the utility of HighSSR in comparison to Roche gsAssembler on two Roche 454 GS FLX runs. The majority of the HighSSR-assembled loci were reliably mapped against model organism reference genomes. HighSSR demultiplexes pooled libraries, assesses locus polymorphism, and implements Primer3 for the design of PCR primers flanking polymorphic microsatellite loci. As sequencing costs drop and permit the analysis of all project samples on next-generation platforms, this framework can also be used for direct simple sequence repeat (SSR) genotyping.
Availability: http://code.google.com/p/highssr/
Contact: alexander@big.ac.cn
Supplementary Information: Supplementary data are available at Bioinformatics online.
- Published
- 2012