A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome

Authors :
Mark Yandell
Sima Misra
Adina M. Bailey
Martha Evans-Holm
Gerald M. Rubin
ShengQiang Shu
Susan E. Celniker
Colin Wiel
Source :
Proceedings of the National Academy of Sciences. 102:1566-1571
Publication Year :
2005
Publisher :
Proceedings of the National Academy of Sciences, 2005.

Abstract

Five years after the completion of the sequence of the Drosophila melanogaster genome, the number of protein-coding genes it contains remains a matter of debate; the number of computational gene predictions greatly exceeds the number of validated gene annotations. We have assembled a collection of >10,000 gene predictions that do not overlap existing gene annotations and have developed a process for their validation that allows us to efficiently prioritize and experimentally validate predictions from various sources by sequencing RT-PCR products to confirm gene structures. Our data provide experimental evidence for 122 protein-coding genes. Our analyses suggest that the entire collection of predictions contains only ≈700 additional protein-coding genes. Although we cannot rule out the discovery of genes with unusual features that make them refractory to existing methods, our results suggest that the D. melanogaster genome contains ≈14,000 protein-coding genes.

Details

ISSN :
1091-6490 and 0027-8424
Volume :
102
Database :
OpenAIRE
Journal :
Proceedings of the National Academy of Sciences
Accession number :
edsair.doi.dedup.....bcc37994b60d7881e1e2f0eee6ac21bc
Full Text :
https://doi.org/10.1073/pnas.0409421102