Structural variation analysis with strobe reads
- Source :
- Bioinformatics. 26:1291-1298
- Publication Year :
- 2010
- Publisher :
- Oxford University Press (OUP), 2010.
Abstract
- Motivation: Structural variation, including deletions, duplications and rearrangements of DNA sequence, is an important contributor to genome variation in many organisms. In humans, many structural variants are found in complex and highly repetitive regions of the genome, making their identification difficult. A new sequencing technology called strobe sequencing generates strobe reads containing multiple subreads from a single contiguous fragment of DNA. Strobe reads thus generalize the concept of paired reads, or mate pairs, that have been routinely used for structural variant detection. Strobe sequencing holds promise for unraveling complex variants that have been difficult to characterize with current sequencing technologies. Results: We introduce an algorithm for identification of structural variants using strobe sequencing data. We consider strobe reads from a test genome that have multiple possible alignments to a reference genome due to sequencing errors and/or repetitive sequences in the reference. We formulate the combinatorial optimization problem of finding the minimum number of structural variants in the test genome that are consistent with these alignments. We solve this problem using an integer linear program. Using simulated strobe sequencing data, we show that our algorithm has better sensitivity and specificity than paired read approaches for structural variation identification. Contact: braphael@brown.edu
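The optimization problem described in the abstract can be illustrated with a toy sketch. This is not the authors' integer linear program: it replaces the ILP with exhaustive search, and it assumes each candidate alignment of a read has already been reduced to the set of structural variants it would imply. The function name `min_variant_assignment`, the read identifiers and the variant labels are all hypothetical.

```python
from itertools import product


def min_variant_assignment(candidates):
    """Pick one candidate alignment per read so that the total number of
    distinct implied structural variants is as small as possible.

    candidates: dict mapping a read id to a list of sets; each set holds the
    (hypothetical) variant labels implied by one candidate alignment.
    Returns (chosen, variants): the chosen alignment index per read and the
    union of variants implied by that choice.
    """
    reads = sorted(candidates)
    best_choice, best_variants = None, None
    # Exhaustive search over every combination of alignments. This is only
    # feasible for toy inputs; the paper solves the problem with an ILP.
    for combo in product(*(range(len(candidates[r])) for r in reads)):
        variants = set()
        for read, idx in zip(reads, combo):
            variants |= candidates[read][idx]
        if best_variants is None or len(variants) < len(best_variants):
            best_choice, best_variants = dict(zip(reads, combo)), variants
    return best_choice, best_variants


# Three reads, each with two ambiguous alignments (made-up labels).
example = {
    "r1": [{"del_A"}, {"inv_B"}],
    "r2": [{"del_A"}, {"dup_C"}],
    "r3": [{"inv_B", "dup_C"}, {"del_A"}],
}
chosen, variants = min_variant_assignment(example)
print(chosen, variants)
```

In this example a single deletion explains all three reads, so the search selects the alignments that imply `del_A` and rejects the alternatives that would require two or more variants.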
- Subjects :
- Statistics and Probability
Hybrid genome assembly
Computational biology
Biology
Biochemistry
Genome
DNA sequencing
Deep sequencing
Structural variation
Databases, Genetic
Humans
Molecular Biology
Genetics
Base Sequence
Genome, Human
Genetic Variation
DNA
Genomics
Sequence Analysis, DNA
Computer Science Applications
Computational Mathematics
Identification (information)
Computational Theory and Mathematics
Sequence Alignment
Reference genome
Integer (computer science)
Details
- ISSN :
- 1367-4811 and 1367-4803
- Volume :
- 26
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....2fe5c1333a3a74ee020d2799adfd0106