1. RepeatsDB in 2021: improved data and extended classification for protein tandem repeat structures
- Author
- Pablo Lorenzano Menna, Martina Bevilacqua, Mariane Gonçalves Kulik, Alexander Miguel Monzon, Lisanna Paladin, José Luis López, Martin Gonzalez Buitron, Javier Rios, Marco Necci, Sara Errigo, Layla Hirsh, Ivan Mičetić, Juliet F. Nilsson, Andrey V. Kajava, María Silvina Fornasari, Antonio Lagares, Damiano Piovesan, Sebastian Fernandez-Alberti, Maia Diana Eliana Cabrera, Gustavo Parisi, María Laura Fabre, Miguel A. Andrade-Navarro, Silvio C. E. Tosatto, Centre de recherche en Biologie Cellulaire (CRBM), and Université Montpellier 2 - Sciences et Techniques (UM2)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Université Montpellier 1 (UM1)
- Subjects
Repetitive Sequences, Amino Acid; AcademicSubjects/SCI00010; Statistics as Topic; Protein Data Bank (RCSB PDB); Computational biology; Biology; Repetitive Sequences; Gene Ontology; HEK293 Cells; HeLa Cells; Humans; Proteins; Reproducibility of Results; User-Computer Interface; Databases, Protein; Tandem Repeat Sequences; Databases; 03 medical and health sciences; Annotation; Protein structure; Similarity (network science); Tandem repeat; Genetics; Database Issue; Exact Sciences; database; 030304 developmental biology; 0303 health sciences; Hierarchy (mathematics); Protein; 030302 biochemistry & molecular biology; computer.file_format; Protein Data Bank; Class (biology); proteins; Amino Acid; ComputingMethodologies_PATTERNRECOGNITION; classification; protein tandem repeat structures; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; computer
- Abstract
The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and introduces an extended classification scheme. The major conceptual change compared to the previous version is a hierarchical classification that combines top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) that require sequence similarity and describe repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.
Facultad de Ciencias Exactas, Instituto de Biotecnologia y Biologia Molecular
- Published
- 2020