1. proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes.
- Author
- Fullam A, Letunic I, Schmidt TSB, Ducarmon QR, Karcher N, Khedkar S, Kuhn M, Larralde M, Maistrenko OM, Malfertheiner L, Milanese A, Rodrigues JFM, Sanchis-López C, Schudoma C, Szklarczyk D, Sunagawa S, Zeller G, Huerta-Cepas J, von Mering C, Bork P, and Mende DR
- Subjects
- Databases, Genetic, Genomics, Molecular Sequence Annotation, Bacteria classification, Bacteria genetics, Genome, Prokaryotic Cells
- Abstract
The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/. (© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2023