Amplification biases and consistent recovery of loci in a double-digest RAD-seq protocol.
- Source :
- PloS one [PLoS One] 2014 Sep 04; Vol. 9 (9), pp. e106713. Date of Electronic Publication: 2014 Sep 04 (Print Publication: 2014).
- Publication Year :
- 2014
Abstract
- A growing variety of "genotype-by-sequencing" (GBS) methods use restriction enzymes and high throughput DNA sequencing to generate data for a subset of genomic loci, allowing the simultaneous discovery and genotyping of thousands of polymorphisms in a set of multiplexed samples. We evaluated a "double-digest" restriction-site associated DNA sequencing (ddRAD-seq) protocol by 1) comparing results for a zebra finch (Taeniopygia guttata) sample with in silico predictions from the zebra finch reference genome; 2) assessing data quality for a population sample of indigobirds (Vidua spp.); and 3) testing for consistent recovery of loci across multiple samples and sequencing runs. Comparison with in silico predictions revealed that 1) over 90% of predicted, single-copy loci in our targeted size range (178-328 bp) were recovered; 2) short restriction fragments (38-178 bp) were carried through the size selection step and sequenced at appreciable depth, generating unexpected but nonetheless useful data; 3) amplification bias favored shorter, GC-rich fragments, contributing to among locus variation in sequencing depth that was strongly correlated across samples; 4) our use of restriction enzymes with a GC-rich recognition sequence resulted in an up to four-fold overrepresentation of GC-rich portions of the genome; and 5) star activity (i.e., non-specific cutting) resulted in thousands of "extra" loci sequenced at low depth. Results for three species of indigobirds show that a common set of thousands of loci can be consistently recovered across both individual samples and sequencing runs. In a run with 46 samples, we genotyped 5,996 loci in all individuals and 9,833 loci in 42 or more individuals, resulting in <1% missing data for the larger data set. We compare our approach to similar methods and discuss the range of factors (fragment library preparation, natural genetic variation, bioinformatics) influencing the recovery of a consistent set of loci among samples.
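The in silico prediction step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. The abstract does not name the enzymes, so the recognition sequences used here (EcoRI `GAATTC` and SbfI `CCTGCAGG`) are illustrative assumptions, as are the function names `find_sites` and `double_digest`. The sketch also treats each cut as occurring at the start of the recognition site, which ignores the true cut offset and overhang. Only the 178-328 bp size window is taken from the abstract.

```python
import re


def find_sites(seq, site):
    """Return 0-based start positions of every (possibly overlapping) match of site."""
    return [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq)]


def double_digest(seq, site_a, site_b, min_len, max_len):
    """Predict ddRAD-style fragments from a complete double digest.

    A fragment is kept only if it lies between two adjacent cut sites,
    is flanked by one site of each enzyme, and has a length inside the
    size-selection window [min_len, max_len].
    Simplification (assumption): the cut is placed at the start of each
    recognition site, so lengths are approximate.
    """
    cuts = sorted(
        [(pos, "A") for pos in find_sites(seq, site_a)]
        + [(pos, "B") for pos in find_sites(seq, site_b)]
    )
    fragments = []
    for (start, enz1), (end, enz2) in zip(cuts, cuts[1:]):
        length = end - start
        if enz1 != enz2 and min_len <= length <= max_len:
            fragments.append((start, end, length))
    return fragments


def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Toy sequence (hypothetical, not real genome data) with four cut sites.
toy = (
    "GAATTC" + "A" * 194      # enzyme A site, then filler
    + "CCTGCAGG" + "T" * 50   # enzyme B site, then a short B-to-B interval
    + "CCTGCAGG" + "C" * 292  # second enzyme B site, then filler
    + "GAATTC"                # closing enzyme A site
)

predicted = double_digest(toy, "GAATTC", "CCTGCAGG", 178, 328)
for start, end, length in predicted:
    print(start, end, length, round(gc_fraction(toy[start:end]), 2))
```

On a real reference genome, the same function would be applied per chromosome, and the predicted fragments could then be compared with the loci actually recovered, including their GC content and length, which are the two properties the abstract links to amplification bias.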
- Subjects :
- Animals
- Base Composition
- Bias
- Chromosome Mapping
- Computational Biology/methods
- Genetic Loci
- High-Throughput Nucleotide Sequencing/methods
- Humans
- Nucleic Acid Amplification Techniques/methods
- Polymorphism, Single Nucleotide
- Computational Biology/statistics & numerical data
- Finches/genetics
- Genome
- High-Throughput Nucleotide Sequencing/statistics & numerical data
- Nucleic Acid Amplification Techniques/statistics & numerical data
- Passeriformes/genetics
- Language :
- English
- ISSN :
- 1932-6203
- Volume :
- 9
- Issue :
- 9
- Database :
- MEDLINE
- Journal :
- PloS one
- Publication Type :
- Academic Journal
- Accession number :
- 25188270
- Full Text :
- https://doi.org/10.1371/journal.pone.0106713