Efficiently detecting polymorphisms during the fragment assembly process

Authors :
Daniel Fasulo
Clark M. Mobarry
Ian M. Dew
Aaron L. Halpern
Source :
ISMB
Publication Year :
2002
Publisher :
Oxford University Press (OUP), 2002.

Abstract

Motivation: Current genomic sequence assemblers assume that the input data is derived from a single, homogeneous source. However, recent whole-genome shotgun sequencing projects have violated this assumption, resulting in input fragments covering the same region of the genome whose sequences differ due to polymorphic variation in the population. While single-nucleotide polymorphisms (SNPs) do not pose a significant problem to state-of-the-art assembly methods, these methods do not handle insertion/deletion (indel) polymorphisms of more than a few bases.

Results: This paper describes an efficient method for detecting sequence discrepancies due to polymorphism that avoids resorting to global use of more costly, less stringent affine sequence alignments. Instead, the algorithm uses graph-based methods to determine the small set of fragments involved in each polymorphism and performs more sophisticated alignments only among fragments in that set. Results from the incorporation of this method into the Celera Assembler are reported for the D. melanogaster, H. sapiens, and M. musculus genomes.

Availability: The method described herein does not constitute a stand-alone software application, but is laid out in sufficient detail to be implemented as a component of any genomic sequence assembler.

Contact: daniel.fasulo@celera.com

Keywords: whole-genome assembly; shotgun sequencing; polymorphism.

Details

ISSN :
1367-4811 and 1367-4803
Volume :
18
Database :
OpenAIRE
Journal :
Bioinformatics
Accession number :
edsair.doi.dedup.....665945127dd158a7fc06beff46a698be
Full Text :
https://doi.org/10.1093/bioinformatics/18.suppl_1.s294