RepeatsDB in 2021: improved data and extended classification for protein tandem repeat structures
- Source :
- Nucleic Acids Research, Oxford University Press, 2020, ⟨10.1093/nar/gkaa1097⟩; SEDICI (UNLP), Universidad Nacional de La Plata, instacron:UNLP
- Publication Year :
- 2020
- Publisher :
- Oxford University Press, 2020.
Abstract
- The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification, which combines top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) that require sequence similarity and describe repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.
Facultad de Ciencias Exactas
Instituto de Biotecnologia y Biologia Molecular
- Subjects :
- Repetitive Sequences, Amino Acid
AcademicSubjects/SCI00010
Biología
Statistics as Topic
Protein Data Bank (RCSB PDB)
Computational biology
Biology
Repetitive Sequences
Gene Ontology
HEK293 Cells
HeLa Cells
Humans
Proteins
Reproducibility of Results
User-Computer Interface
Databases, Protein
Tandem Repeat Sequences
Databases
03 medical and health sciences
Annotation
Protein structure
Similarity (network science)
Tandem repeat
Genetics
Database Issue
Ciencias Exactas
database
030304 developmental biology
0303 health sciences
Hierarchy (mathematics)
Protein
030302 biochemistry & molecular biology
computer.file_format
Protein Data Bank
Class (biology)
proteins
Amino Acid
ComputingMethodologies_PATTERNRECOGNITION
classification
protein tandem repeat structures
[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]
computer
Details
- Language :
- English
- ISSN :
- 1362-4962 and 0305-1048
- Volume :
- 49
- Issue :
- D1
- Database :
- OpenAIRE
- Journal :
- Nucleic Acids Research
- Accession number :
- edsair.doi.dedup.....588bf86faec4b1890cf5cf27d81eb86a