1. An assessment of the sequence gaps: unfinished business in a finished human genome
- Author
- Xinwei She, Royden A. Clark, and Evan E. Eichler
- Subjects
Genetics; Genome evolution; Internet; Polymorphism, Genetic; Euchromatin; Databases, Factual; Genome, Human; Genome project; Gene Annotation; Computational biology; Sequence Analysis, DNA; Biology; Genome; Gene Duplication; Heterochromatin; Humans; Human genome; Molecular Biology; Sequence Alignment; Genetics (clinical); Segmental duplication; Sequence (medicine)
- Abstract
Biological research increasingly depends on 'finished' genome sequences. Deducing what is absent from these sequences is not trivial. More than 99% of the euchromatic portion of the human genome is now represented as a high-quality finished sequence with each base ordered and oriented. However, two principal types of gap remain: heterochromatic (estimated to be ∼200 Mb) and euchromatic (23.0 Mb) gaps. Here, we use various global sources of data to help understand the nature of the gaps in the finished human genome. Not all gaps are recalcitrant to subcloning, nor are most heterochromatic. The presence of recent segmental duplications is the most important predictor of gap location in euchromatic sequences. The resolution of these regions remains an important challenge for the completion of the human genome, gene annotation and SNP assignment.
- Published
- 2004