1. The COMBREX project: design, methodology, and initial results
- Author
Kimmen Sjölander, Jyotsna Guleria, Donald J. Ferguson, Giovanni Gadda, John F. Hunt, Almaz Maksad, Maria Jesus Martin, Revonda M. Pokrzywa, Charles DeLisi, Linda Columbus, David Horn, John Tate, Dieter Söll, Rajeswari Swaminathan, Jeffrey H Miller, Lina L. Faller, Alexander F. Yakunin, Bernhard O. Palsson, Martin Steffen, Granger G. Sutton, Daniel Segrè, Kenneth E. Rudd, Krista Rochussen, Peter D. Karp, Mark G. McGettrick, Alexey Fomenkov, Han-Pil Choi, Ramana Madupu, Robert Blumenthal, Manuel Ferrer, Jim C. Spain, Claire O'Donovan, Russell Greiner, J. Martin Bollinger, Ami Levy-Moonshine, Richard J. Roberts, William Klimke, Shuang-yong Xu, Kevin R. Tao, Yi Chien Chang, Caitlin Monahan, Julien Gobeill, Germán Plata, Varun Mazumdar, Aaron T. Setterdahl, Dmitri Tchigvintsev, Genevieve Housman, Jie Hu, John Rachlin, Woo Suk Chang, Ashok S. Bhagwat, Michael Y. Galperin, Irina A. Rodionova, Zhenjun Hu, Lais Osmani, Carsten Krebs, Dennis Vitkup, Brian P. Anton, Daniel H. Haft, Iddo Friedberg, Simon Kasif, Steven E. Brenner, Steven L. Salzberg, Stanley Letovsky, Niels Klitgord, Dana Macelis, Alex Bateman, Richard D. Morgan, Peter Brown, Valérie de Crécy-Lagard, Andrei L. Osterman, Benjamin L. Allen, Dmitry A. Rodionov, Patrick Ruch, and National Institute of General Medical Sciences (US)
- Subjects
QH301-705.5 ,Biological Data Management ,Biology ,Biochemistry ,Microbiology ,Genomic databases ,General Biochemistry, Genetics and Molecular Biology ,Molecular Genetics ,03 medical and health sciences ,Model Organisms ,0302 clinical medicine ,Community Page ,Genetics ,Genome Databases ,Humans ,Genome Sequencing ,Biology (General) ,Microbial Pathogens ,030304 developmental biology ,Escherichia Coli ,0303 health sciences ,General Immunology and Microbiology ,General Neuroscience ,Computational Biology ,Experimental data ,Genomics ,Genome project ,Comparative Genomics ,Models, Theoretical ,Data science ,Enzymes ,Functional Genomics ,3. Good health ,Microbial Genes ,Prokaryotic Models ,Gene Function ,General Agricultural and Biological Sciences ,Sequence Analysis ,030217 neurology & neurosurgery ,Project design
- Abstract
© 2013 Anton et al. Prior to the “genomic era,” when the acquisition of DNA sequence involved significant labor and expense, the sequencing of genes was strongly linked to the experimental characterization of their products. Sequencing at that time directly resulted from the need to understand an experimentally determined phenotype or biochemical activity. Now that DNA sequencing has become orders of magnitude faster and less expensive, focus has shifted to sequencing entire genomes. Since biochemistry and genetics have not, by and large, enjoyed the same improvement of scale, public sequence repositories now predominantly contain putative protein sequences for which there is no direct experimental evidence of function. Computational approaches attempt to leverage evidence associated with the ever-smaller fraction of experimentally analyzed proteins to predict function for these putative proteins. Maximizing our understanding of function over the universe of proteins in toto requires not only robust computational methods of inference but also a judicious allocation of experimental resources, focusing on proteins whose experimental characterization will maximize the number and accuracy of follow-on predictions. COMBREX is funded by a GO grant from the National Institute of General Medical Sciences (NIGMS) (1RC2GM092602-01).
- Published
- 2014