Annotating Macromolecular Complexes in the Protein Data Bank: Improving the FAIRness of Structure Data

Authors :
Sri Devan Appasamy
John Berrisford
Romana Gaborova
Sreenath Nair
Stephen Anyango
Sergei Grudinin
Mandar Deshpande
David Armstrong
Ivanna Pidruchna
Joseph I. J. Ellaway
Grisell Díaz Leines
Deepti Gupta
Deborah Harrus
Mihaly Varadi
Sameer Velankar
Source :
Scientific Data, Vol 10, Iss 1, Pp 1-13 (2023)
Publication Year :
2023
Publisher :
Nature Portfolio, 2023.

Abstract

Macromolecular complexes are essential functional units in nearly all cellular processes, and their atomic-level understanding is critical for elucidating and modulating molecular mechanisms. The Protein Data Bank (PDB) serves as the global repository for experimentally determined structures of macromolecules. Structural data in the PDB offer valuable insights into the dynamics, conformation, and functional states of biological assemblies. However, current annotation practices lack standardised naming conventions for assemblies in the PDB, complicating the identification of instances representing the same assembly. In this study, we introduce a method leveraging resources external to the PDB, such as the Complex Portal, UniProt and Gene Ontology, to accurately describe assemblies and contextualise them within their biological settings. Employing the proposed approach, we assigned standard names to over 90% of unique assemblies in the PDB and provided persistent identifiers for each assembly. This standardisation of assembly data enhances the PDB, facilitating a deeper understanding of macromolecular complexes. Furthermore, the data standardisation improves the PDB’s FAIR attributes, fostering more effective basic and translational research and scientific education.

Subjects

Subjects :
Science

Details

Language :
English
ISSN :
2052-4463
Volume :
10
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Scientific Data
Publication Type :
Academic Journal
Accession number :
edsdoj.0756c17794b4f30a37d6a65e24cb126
Document Type :
article
Full Text :
https://doi.org/10.1038/s41597-023-02778-9