1. [Untitled]
- Author
Thomas R. Gingeras, Suzanna E. Lewis, Josep F. Abril, Tim Hubbard, Vladimir B. Bajic, Paul Flicek, Alexandre Reymond, Stylianos E. Antonarakis, Robert Castelo, Eduardo Eyras, Roderic Guigó, Catherine Ucla, Jennifer Harrow, Ewan Birney, Martin G. Reese, Michael Ashburner, and Julien Lagarde
- Subjects
Genetics, 0303 health sciences, GENCODE, Alternative splicing, Genomics, Genome project, Computational biology, Biology, ENCODE, Human genetics, 03 medical and health sciences, 0302 clinical medicine, 030220 oncology & carcinogenesis, Human genome, Gene, 030304 developmental biology
- Abstract
Background: We present the results of EGASP, a community experiment to assess the state of the art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: to assess the accuracy of computational methods for predicting protein-coding genes, and to assess the overall completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a ‘reference set’ of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment.
Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, accuracy at the multiple-transcript level, which takes alternative splicing into account, reached only approximately 40% to 50%. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of
- Published
- 2006