1. A roadmap for the functional annotation of protein families: a community perspective
- Author
- Valérie de Crécy-Lagard, Rocio Amorin de Hegedus, Cecilia Arighi, Jill Babor, Alex Bateman, Ian Blaby, Crysten Blaby-Haas, Alan J Bridge, Stephen K Burley, Stacey Cleveland, Lucy J Colwell, Ana Conesa, Christian Dallago, Antoine Danchin, Anita de Waard, Adam Deutschbauer, Raquel Dias, Yousong Ding, Gang Fang, Iddo Friedberg, John Gerlt, Joshua Goldford, Mark Gorelik, Benjamin M Gyori, Christopher Henry, Geoffrey Hutinet, Marshall Jaroch, Peter D Karp, Liudmyla Kondratova, Zhiyong Lu, Aron Marchler-Bauer, Maria-Jesus Martin, Claire McWhite, Gaurav D Moghe, Paul Monaghan, Anne Morgat, Christopher J Mungall, Darren A Natale, William C Nelson, Seán O’Donoghue, Christine Orengo, Katherine H O’Toole, Predrag Radivojac, Colbie Reed, Richard J Roberts, Dmitri Rodionov, Irina A Rodionova, Jeffrey D Rudolf, Lana Saleh, Gloria Sheynkman, Francoise Thibaud-Nissen, Paul D Thomas, Peter Uetz, David Vallenet, Erica Watson Carter, Peter R Weigele, Valerie Wood, Elisha M Wood-Charlson, Jin Xu, de Crécy-Lagard, Valérie [0000-0002-9955-3785], Arighi, Cecilia [0000-0002-0803-4817], Bateman, Alex [0000-0002-6982-4660], Blaby, Ian [0000-0002-1631-3154], Bridge, Alan J [0000-0003-2148-9135], Burley, Stephen K [0000-0002-2487-9713], Conesa, Ana [0000-0001-9597-311X], Dallago, Christian [0000-0003-4650-6181], Danchin, Antoine [0000-0002-6350-5001], de Waard, Anita [0000-0002-9034-4119], Ding, Yousong [0000-0001-8610-0659], Friedberg, Iddo [0000-0002-1789-8000], Gyori, Benjamin M [0000-0001-9439-5346], Lu, Zhiyong [0000-0001-9998-916X], Mungall, Christopher J [0000-0002-6601-2165], Radivojac, Predrag [0000-0002-6769-0793], Rodionova, Irina A [0000-0002-6500-2758], Sheynkman, Gloria [0000-0002-4223-9947], Thomas, Paul D [0000-0002-9074-3507], Vallenet, David [0000-0001-6648-0332], Weigele, Peter R [0000-0003-3696-4541], Wood, Valerie [0000-0001-6330-7526], Apollo - University of Cambridge Repository, National Science Foundation (US), National Institutes of Health (US), and National Library of Medicine (US)
- Subjects
- Genome; Base Sequence; Human Genome; Computational Biology; Proteins; Molecular Sequence Annotation; Genomics; General Biochemistry, Genetics and Molecular Biology; Data Format; Library and Information Studies; Genetics; Generic health relevance; General Agricultural and Biological Sciences; Information Systems
- Abstract
- Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by the more than 500 000 genomes sequenced to date. By different estimates, only half of the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between organismal lineages. Such a large gap in knowledge hampers all aspects of the biological enterprise and thereby stands in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue, funded by the National Science Foundation, was held on 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists in the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.
Funding: National Science Foundation (grant MCB-2129768) to V.d.C.-L.; National Institutes of Health Intramural Research Program, National Library of Medicine (Z.L.).
- Published
- 2022