1. GENCODE 2021
- Author
Fabio C. P. Navarro, Jonathan M. Mudge, S. Mohanan, Adam Frankish, Joel Armstrong, Tiago Grego, Irwin Jungreis, Roderic Guigó, Jinrui Xu, Benedict Paten, Cristina Sisu, Daniel R. Zerbino, Julien Lagarde, Mark Diekhans, José M. González, Michael L. Tress, E. Stapleton, Osagie G. Izuogu, Mark Gerstein, Ian T. Fiddes, Toby Hunt, Sarah Donaldson, Marie Marthe Suner, Fernando Pozo, Andrew D. Yates, S. Carbonell Sala, T. Di Domenico, Matthew Hardy, Barbara Uszczynska-Ratajczak, Fiona Cunningham, Andrew Berry, Anne Parker, Laura Martinez, Alexandra Bignell, Bianca M. Schmitt, Yan Zhang, Jane E. Loveland, Baikang Pei, Jyoti S. Choudhary, F. C. Riera, Paul R. Muir, C. Garcia Giron, Tim Hubbard, Fergal J. Martin, Rory Johnson, Magali Ruffier, If Barnes, James C. Wright, I. Sycheva, Manolis Kellis, Carles Boix, Thibaut Hourlier, Paul Flicek, Maxim Y Wolf, Y. T. Yang, and Kerstin Howe
- Subjects
Transcription, Genetic, AcademicSubjects/SCI00010, Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), 610 Medicine & health, Computational biology, Genome browser, Biology, Genome, Mice, 03 medical and health sciences, Annotation, 0302 clinical medicine, Databases, Genetic, Genetics, Database Issue, Animals, Humans, Ensembl, Epidemics, GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries), 030304 developmental biology, Internet, 0303 health sciences, SARS-CoV-2, GENCODE, COVID-19, Computational Biology, Molecular Sequence Annotation, Genomics, ComputingMethodologies_PATTERNRECOGNITION, Genome Biology, RNA, Long Noncoding, Pseudogenes, 030217 neurology & neurosurgery, Reference genome
- Abstract
© The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.
The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
Funding: National Human Genome Research Institute of the National Institutes of Health [U41HG007234]; the content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health; Wellcome Trust [WT108749/Z/15/Z, WT200990/Z/16/Z]; European Molecular Biology Laboratory; Swiss National Science Foundation through the National Center of Competence in Research ‘RNA & Disease’ (to R.J.); Medical Faculty of the University of Bern (to R.J.). Funding for open access charge: National Institutes of Health.
- Published
- 2020