163 results on '"Sameer Velankar"'
Search Results
52. The EBI enzyme portal.
- Author
-
Rafael Alcántara, Joseph Onwubiko, Hong Cao, Paula de Matos, Jennifer A. Cham, Julius O. B. Jacobsen, Gemma L. Holliday, Julia D. Fischer, Syed Asad Rahman, Bijay Jassal, Mickael Goujon, Francis Rowland, Sameer Velankar, Rodrigo Lopez, John P. Overington, Gerard J. Kleywegt, Henning Hermjakob, Claire O'Donovan, Maria Jesus Martin, Janet M. Thornton, and Christoph Steinbeck
- Published
- 2013
53. Patterns of database citation in articles and patents indicate long-term scientific and industry value of biological data resources [version 1; referees: 3 approved]
- Author
-
David Bousfield, Johanna McEntyre, Sameer Velankar, George Papadatos, Alex Bateman, Guy Cochrane, Jee-Hyub Kim, Florian Graef, Vid Vartak, Blaise Alako, and Niklas Blomberg
- Subjects
Research Article; Articles; Data Sharing; Publishing & Peer Review; Data citations; Data reuse; Data repositories; Data archiving; Open data; Bibliometrics; Patent analysis; Research impact
- Abstract
Data from open access biomolecular data resources, such as the European Nucleotide Archive and the Protein Data Bank, are extensively reused within life science research for comparative studies, method development and to derive new scientific insights. Indicators that estimate the extent and utility of such secondary use of research data need to reflect this complex and highly variable data usage. By linking open access scientific literature, via Europe PubMed Central, to the metadata in biological data resources, we separate data citations associated with a deposition statement from citations that capture the subsequent, long-term reuse of data in academia and industry. We extend this analysis to begin to investigate citations of biomolecular resources in patent documents. We find citations in more than 8,000 patents from 2014, demonstrating substantial use and an important role for data resources in defining biological concepts in granted patents to both academic and industrial innovators. Taken together, our results indicate that the citation patterns in biomedical literature and patents vary, not only due to citation practice but also according to the data resource cited. The results guard against the use of simple metrics such as citation counts and show that indicators of data use must not only take into account citations within the biomedical literature but also include reuse of data in industry and other parts of society by including patents and other scientific and technical documents such as guidelines, reports and grant applications.
- Published
- 2016
54. The impact of structural bioinformatics tools and resources on SARS-CoV-2 research and therapeutic strategies
- Author
-
Mihaly Varadi, Vincent Zoete, Sameer Velankar, Neeladri Sen, Antoine Daina, Shoshana J. Wodak, Christine A. Orengo, and Vaishali P Waman
- Subjects
AcademicSubjects/SCI01060; Coronavirus disease 2019 (COVID-19); Protein Conformation; Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); protein 3D structures; Disease; Computational biology; Biology; Antiviral Agents; Viral Proteins; 03 medical and health sciences; Structural bioinformatics; Protein structure; therapeutics; Humans; Molecular Biology; Host protein; Method Review; 030304 developmental biology; 0303 health sciences; SARS-CoV-2; 030302 biochemistry & molecular biology; COVID-19; Computational Biology; structural bioinformatics; computer.file_format; Protein Data Bank; structure prediction; mutation/variation; COVID-19 Drug Treatment; 3. Good health; Structural biology; computer; Information Systems
- Abstract
SARS-CoV-2 is the causative agent of COVID-19, the ongoing global pandemic. It has posed a worldwide challenge to human health as no effective treatment is currently available to combat the disease. Its severity has led to unprecedented collaborative initiatives for therapeutic solutions against COVID-19. Studies resorting to structure-based drug design for COVID-19 are plethoric and show good promise. Structural biology provides key insights into 3D structures, critical residues/mutations in SARS-CoV-2 proteins, implicated in infectivity, molecular recognition and susceptibility to a broad range of host species. The detailed understanding of viral proteins and their complexes with host receptors and candidate epitope/lead compounds is the key to developing a structure-guided therapeutic design. Since the discovery of SARS-CoV-2, several structures of its proteins have been determined experimentally at an unprecedented speed and deposited in the Protein Data Bank. Further, specialized structural bioinformatics tools and resources have been developed for theoretical models, data on protein dynamics from computer simulations, impact of variants/mutations and molecular therapeutics. Here, we provide an overview of ongoing efforts on developing structural bioinformatics tools and resources for COVID-19 research. We also discuss the impact of these resources and structure-based studies, to understand various aspects of SARS-CoV-2 infection and therapeutic development. These include (i) understanding differences between SARS-CoV-2 and SARS-CoV, leading to increased infectivity of SARS-CoV-2, (ii) deciphering key residues in the SARS-CoV-2 involved in receptor–antibody recognition, (iii) analysis of variants in host proteins that affect host susceptibility to infection and (iv) analyses facilitating structure-based drug and vaccine design against SARS-CoV-2.
- Published
- 2020
55. High-performance macromolecular data delivery and visualization for the web
- Author
-
Stephen K. Burley, Karel Berka, Jaroslav Koča, Radka Svobodová, Alexander S. Rose, Sameer Velankar, and David Sehnal
- Subjects
genetic structures; Exploit; Computer science; web-based; computer.software_genre; 03 medical and health sciences; 0302 clinical medicine; Resource (project management); Structural Biology; data delivery; Web application; macromolecules; visualization; 030304 developmental biology; Structure (mathematical logic); 0303 health sciences; Focus (computing); browser-based; business.industry; Experimental data; Data science; Visualization; Web service; Ccp4; business; computer; 030217 neurology & neurosurgery
- Abstract
This article provides a survey of available web services and tools for data delivery and visualization of macromolecular structures.
Biomacromolecular structural data make up a vital and crucial scientific resource that has grown not only in terms of its amount but also in its size and complexity. Furthermore, these data are accompanied by large and increasing amounts of experimental data. Additionally, the macromolecular data are enriched with value-added annotations describing their biological, physicochemical and structural properties. Today, the scientific community requires fast and fully interactive web visualization to exploit this complex structural information. This article provides a survey of the available cutting-edge web services that address this challenge. Specifically, it focuses on data-delivery problems, discusses the visualization of a single structure, including experimental data and annotations, and concludes with a focus on the results of molecular-dynamics simulations and the visualization of structural ensembles.
- Published
- 2020
56. PDBe: Protein Data Bank in Europe.
- Author
-
Sameer Velankar, Younes Alhroub, Christoph Best, Ségolène Caboche, Matthew J. Conroy, Jose M. Dana, Manuel A. Fernandez Montecelo, Glen van Ginkel, Adel Golovin, Swanand P. Gore, Aleksandras Gutmanas, Pauline Haslam, Pieter M. S. Hendrickx, Egon Heuson, Miriam Hirshberg, Melford John, Ingvar C. Lagerstedt, Saqib Mir, Laurence E. Newman, Thomas J. Oldfield, Ardan Patwardhan, Luana Rinaldi, Gaurav Sahni, Eduardo Sanz-García, Sanchayita Sen, Robert A. Slowley, Antonio Suarez-Uruena, G. J. Swaminathan, Martyn F. Symmons, Wim F. Vranken, Michael E. Wainwright, and Gerard J. Kleywegt
- Published
- 2012
57. EMDataBank.org: unified data resource for CryoEM.
- Author
-
Catherine L. Lawson, Matthew L. Baker, Christoph Best, Chunxiao Bi, Matthew T. Dougherty, Powei Feng, Glen van Ginkel, Batsal Devkota, Ingvar C. Lagerstedt, Steven J. Ludtke, Richard H. Newman, Thomas J. Oldfield, Ian Rees, Gaurav Sahni, Raul Sala, Sameer Velankar, Joe D. Warren, John D. Westbrook, Kim Henrick, Gerard J. Kleywegt, Helen M. Berman, and Wah Chiu
- Published
- 2011
58. PDBe: Protein Data Bank in Europe.
- Author
-
Sameer Velankar, Younes Alhroub, Anaëlle Alili, Christoph Best, Harry Boutselakis, Ségolène Caboche, Matthew J. Conroy, Jose M. Dana, Glen van Ginkel, Adel Golovin, Swanand P. Gore, Aleksandras Gutmanas, Pauline Haslam, Miriam Hirshberg, Melford John, Ingvar C. Lagerstedt, Saqib Mir, Laurence E. Newman, Thomas J. Oldfield, Christopher J. Penkett, Jorge Pineda-Castillo, Luana Rinaldi, Gaurav Sahni, Grégoire Sawka, Sanchayita Sen, Robert A. Slowley, Alan W. Sousa da Silva, Antonio Suarez-Uruena, Jawahar Swaminathan, Martyn F. Symmons, Wim F. Vranken, Michael E. Wainwright, and Gerard J. Kleywegt
- Published
- 2011
59. Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data.
- Author
-
Jasmine Young, John D. Westbrook, Zukang Feng, Ezra Peisach, Irina Persikova, Raul Sala, Sanchayita Sen, John M. Berrisford, Jawahar Swaminathan, Thomas J. Oldfield, Aleksandras Gutmanas, Reiko Igarashi, David R. Armstrong, Kumaran Baskaran, Li Chen, Minyu Chen, Alice R. Clark, Luigi Di Costanzo, Dimitris Dimitropoulos, Guanghua Gao, Sutapa Ghosh, Swanand P. Gore, Vladimir Guranovic, Pieter M. S. Hendrickx, Brian P. Hudson, Yasuyo Ikegawa, Yumiko Kengaku, Catherine L. Lawson, Yuhe Liang, Lora Mak, Abhik Mukhopadhyay, Buvaneswari Coimbatore Narayanan, Kayoko Nishiyama, Ardan Patwardhan, Gaurav Sahni, Eduardo Sanz-García, Junko Sato, Monica Sekharan, Chenghua Shao, Oliver S. Smart, Lihua Tan, Glen van Ginkel, Huanwang Yang, Marina Zhuravleva, John L. Markley, Haruki Nakamura, Genji Kurisu, Gerard J. Kleywegt, Sameer Velankar, Helen M. Berman, and Stephen K. Burley
- Published
- 2018
60. PDBe aggregated API: programmatic access to an integrative knowledge graph of molecular structure data
- Author
-
David R. Armstrong, Nurul Nadzirin, Lukáš Pravda, Mihaly Varadi, Sameer Velankar, Stephen Anyango, Saqib Mir, John M. Berrisford, Sreenath Nair, and Aleksandras Gutmanas
- Subjects
Statistics and Probability; Supplementary data; Information retrieval; Source code; Graph database; AcademicSubjects/SCI01060; Computer science; media_common.quotation_subject; computer.file_format; Python (programming language); Protein Data Bank; computer.software_genre; Applications Notes; Structural Bioinformatics; Biochemistry; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Knowledge graph; UniProt; Molecular Biology; computer; media_common; computer.programming_language
- Abstract
Summary: The PDBe aggregated API is an open-access and open-source RESTful API that provides programmatic access to a wealth of macromolecular structural data and their functional and biophysical annotations through 80+ API endpoints. The API is powered by the PDBe graph database (https://pdbe.org/graph-schema), an open-access integrative knowledge graph that can be used as a discovery tool to answer complex biological questions.
Availability and implementation: The PDBe aggregated API provides up-to-date access to the PDBe graph database, which has weekly releases with the latest data from the Protein Data Bank, integrated with updated annotations from UniProt, Pfam, CATH, SCOP and the PDBe-KB partner resources. The complete list of available API endpoints and their descriptions is available at https://pdbe.org/graph-api. The source code of the Python 3.6+ API application is publicly available at https://gitlab.ebi.ac.uk/pdbe-kb/services/pdbe-graph-api.
Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2021
61. PDBe-KB: collaboratively defining the biological context of structural data
- Author
-
David Bednar, Sucharita Dey, Emmanuel D. Levy, Natarajan Kannan, Bissan Al-Lazikani, Damiano Piovesan, Luis A Rodriguez, Sameer Velankar, Mihaly Varadi, Jan Stourac, Jaime Prilusky, Manjeet Kumar, Radoslav Krivak, Michael J.E. Sternberg, Juan Fernandez Recio, Daniel Zaidman, David R. Armstrong, Nathan J Rollins, Gulzar Singh, Jiri Damborsky, Dandan Xue, Stephen Anyango, Vivek Modi, Antonio Rosato, Christine A. Orengo, Valeria Putignano, Radka Svobodová, Alessia David, Debora S. Marks, Roland L. Dunbrack, Jose Ramon Macias, David Jakubec, Mark N. Wass, Luis Serrano, Silvio C. E. Tosatto, John M. Berrisford, Ahsan Tanweer, Sreenath Nair, Geoffrey J. Barton, Wim F. Vranken, Lukáš Pravda, Karel Berka, Stuart A McGowan, Janet M. Thornton, Nir London, Madhusudhan M Srivatsan, Lennart Martens, Atilio O Rausch, Toby J. Gibson, Pawel Rubach, Joanna I. Sulkowska, Petr Škoda, Gerardo Pepe, Nathalie Reuter, Natalia Tichshenko, Mandar Deshpande, Franca Fraternali, David Hoksza, Tom L. Blundell, R. Gonzalo Parra, Preeti Choudhary, José María Carazo, Claudia Andreini, Jake E McGreig, Leandro G Radusky, Thomas A. Hopf, Pathmanaban Ramasamy, Carlos Oscar S. Sorzano, Manuela Helmer-Citterich, Kelly P Brock, Nurul Nadzirin, Faculty of Sciences and Bioengineering Sciences, Department of Bio-engineering Sciences, Basic (bio-) Medical Sciences, Chemistry, Informatics and Applied Informatics, Barcelona Supercomputing Center, Biotechnology and Biological Sciences Research Council (UK), European Molecular Biology Laboratory, Ministry of Education, Youth and Sports (Czech Republic), European Commission, Research Foundation - Flanders, Fondazione Cassa di Risparmio di Firenze, Ministerio de Ciencia, Innovación y Universidades (España), Agencia Estatal de Investigación (España), and Wellcome Trust
- Subjects
Models, Molecular; Informàtica::Aplicacions de la informàtica::Bioinformàtica [Àrees temàtiques de la UPC]; Knowledge Base; AcademicSubjects/SCI00010; Protein Conformation; Knowledge Bases; 05 Environmental Sciences; Context (language use); WEB SERVER; PROTEIN; PREDICT; Biology; structural biology; database; bioinorganic chemistry; Macromolecular structure data; 03 medical and health sciences; Structure-Activity Relationship; User-Computer Interface; Bioinformàtica; Genetics; Database Issue; Humans; Protein sequencing; Phosphorylation; Databases, Protein; 030304 developmental biology; 0303 health sciences; Internet; Settore BIO/11; 030302 biochemistry & molecular biology; Proteins; Molecular Sequence Annotation; Protein Data Bank (PDB); 06 Biological Sciences; Data science; Europe; Gene Ontology; Macromolecules; Mutation; 08 Information and Computing Sciences; Protein Processing, Post-Translational; Proteïnes; Developmental Biology
- Abstract
The Protein Data Bank in Europe – Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive.
ELIXIR [IDP implementation study]; Biotechnology and Biological Sciences Research Council via the 3D-Gateway [BB/T01959X/1]; FunPDBe [BB/P024351/1]; European Molecular Biology Laboratory-European Bioinformatics Institute who supported this work; J.D. acknowledges support from the Ministry of Education, Youth and Sport of the Czech Republic [INBIO CZ.02.1.01/0.0/0.0/16_026/0008451]; R.S., K.B. and J.D. also acknowledge support from the Ministry of Education, Youth and Sport of the Czech Republic [ELIXIR-CZ LM2018131]; L.M. acknowledges support from the European Union's Horizon 2020 Programme (H2020-INFRAIA-2018-1) [823839]; Research Foundation Flanders (FWO) [G032816N, G042518N, G028821N]; W.V. acknowledges support from the Research Foundation Flanders (FWO) [G032816N, G028821N]; A.R. acknowledges support from the Fondazione Cassa Di Risparmio di Firenze [24316]; European Commission [101017567]; M.H.C. acknowledges the AIRC project to MHC [IG 23539]; J.F.-R. acknowledges support from the Spanish Ministry of Science and Innovation [PID2019-110167RB-I00]; N.R. acknowledges support from the Norwegian Research Council (Norges Forskningsråd) [288008]; E.D.L. acknowledges support from the European Union's Horizon 2020 research and innovation programme [819318]; M.J.E.S. acknowledges support from the Wellcome Trust [104955/Z/14/Z, 218242/Z/19/Z]. Funding for open access charge: Biotechnology and Biological Sciences Research Council grant [BB/T01959X/1]; Wellcome Trust [104955/Z/14/Z and 218242/Z/19/Z].
- Published
- 2022
62. 3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources
- Author
-
Mihaly Varadi, Sreenath Nair, Ian Sillitoe, Gerardo Tauriello, Stephen Anyango, Stefan Bienert, Clemente Borges, Mandar Deshpande, Tim Green, Demis Hassabis, Andras Hatos, Tamas Hegedus, Maarten L Hekkelman, Robbie Joosten, John Jumper, Agata Laydon, Dmitry Molodenskiy, Damiano Piovesan, Edoardo Salladini, Steven L Salzberg, Markus J Sommer, Martin Steinegger, Erzsebet Suhajda, Dmitri Svergun, Luiggi Tenorio-Ku, Silvio Tosatto, Kathryn Tunyasuvunakool, Andrew Mark Waterhouse, Augustin Žídek, Torsten Schwede, Christine Orengo, and Sameer Velankar
- Subjects
Metadata; experimentally determined structures; computationally predicted structures; structural biology; Records; Health Informatics; Computer Simulation; bioinformatics; Amino Acid Sequence; Databases, Protein; federated data network; Computer Science Applications
- Abstract
While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.
- Published
- 2022
63. Remediation of the protein data bank archive.
- Author
-
Kim Henrick, Zukang Feng, Wolfgang Bluhm, Dimitris Dimitropoulos, Jurgen F. Doreleijers, Shuchismita Dutta, Judith L. Flippen-Anderson, John M. C. Ionides, Chisa Kamada, Eugene Krissinel, Catherine L. Lawson, John L. Markley, Haruki Nakamura, Richard H. Newman, Yukiko Shimizu, Jawahar Swaminathan, Sameer Velankar, Jeramia Ory, Eldon L. Ulrich, Wim F. Vranken, John D. Westbrook, Reiko Yamashita, Huanwang Yang, Jasmine Young, Muhammed Yousufuddin, and Helen M. Berman
- Published
- 2008
64. Automatic annotation of protein residues in published papers
- Author
-
Chris Morris, Robert Firth, Aravind Venkatesan, Sameer Velankar, Johanna McEntyre, Francesco Talo, and Abhik Mukhopadhyay
- Subjects
0303 health sciences; Information retrieval; Computer science; Publications; Biophysics; Proteins; Molecular Sequence Annotation; Condensed Matter Physics; Method Communications; Biochemistry; Automation; 03 medical and health sciences; Annotation; ComputingMethodologies_PATTERNRECOGNITION; 0302 clinical medicine; Structural Biology; Mutation; Genetics; Amino Acids; Software; 030217 neurology & neurosurgery; 030304 developmental biology
- Abstract
This work presents an annotation tool that automatically locates mentions of particular amino-acid residues in published papers and identifies the protein concerned. These matches can be provided in context or in a searchable format in order for researchers to better use the existing and future literature.
- Published
- 2019
65. Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment
- Author
-
Xiaoqin Zou, Théo Mauri, Hang Shi, Shaowen Zhu, Justas Dapkūnas, Yuanfei Sun, Didier Barradas-Bautista, Raphael A. G. Chaleil, Ragul Gowthaman, Sohee Kwon, Xianjin Xu, Zuzana Jandova, Genki Terashi, Ryota Ashizawa, Petras J. Kundrotas, Shuang Zhang, Tunde Aderinwale, Jian Liu, Sandor Vajda, Paul A. Bates, Jianlin Cheng, Daisuke Kihara, Luis A. Rodríguez-Lumbreras, Carlos A. Del Carpio Muñoz, Liming Qiu, Guillaume Brysbaert, Jorge Roel-Touris, Česlovas Venclovas, Tereza Clarence, Rui Yin, Amar Singh, Patryk A. Wesołowski, Rafał Ślusarz, Adam Liwo, Guangbo Yang, Agnieszka S. Karczyńska, Yoshiki Harada, Sergei Kotelnikov, Yuya Hanazono, Charlotte W. van Noort, Marc F. Lensink, Jonghun Won, Adam K. Sieradzan, Israel Desta, Xufeng Lu, Charles Christoffer, Anna Antoniak, Taeyong Park, Sheng-You Huang, Tsukasa Nakamura, Brian G. Pierce, Usman Ghani, Yang Shen, Luigi Cavallo, Chaok Seok, Hao Li, Nurul Nadzirin, Ghazaleh Taherzadeh, Jacob Verburgt, Rodrigo V. Honorato, Artur Giełdoń, Jeffrey J. Gray, Dima Kozakov, Ming Liu, Shan Chang, Eiichiro Ichiishi, Manon Réau, Rui Duan, Francesco Ambrosetti, Johnathan D. Guest, Juan Fernández-Recio, Alexandre M. J. J. Bonvin, Ilya A. Vakser, Farhan Quadir, Yumeng Yan, Ren Kong, Sameer Velankar, Sergei Grudinin, Mateusz Kogut, Mikhail Ignatov, Yasuomi Kiyota, Hyeonuk Woo, Shoshana J. Wodak, Ameya Harmalkar, Shinpei Kobayashi, Panagiotis I. Koukos, Zhen Cao, Kliment Olechnovič, Cezary Czaplewski, Xiao Wang, Agnieszka G. Lipska, Kathryn A. Porter, Peicong Lin, Emilia A. Lubecka, Nasser Hashemi, Bin Liu, Mayuko Takeda-Shitaka, Karolina Zięba, Dzmitry Padhorny, Zhuyezi Sun, Daipayan Sarkar, Romina Oliva, Andrey Alekseenko, Siri Camee van Keulen, Mireia Rosell, Raj S. Roy, Brian Jiménez-García, Jinsol Yang, Martyna Maszota-Zieleniak, Cancer Research UK, Department of Energy and Climate Change (UK), European Commission, Institut National de Recherche en Informatique et en Automatique (France), Medical Research Council (UK), Japan Society for the Promotion of Science, Ministerio de Ciencia, Innovación y Universidades (España), Agencia Estatal de Investigación (España), National Institute of General Medical Sciences (US), National Institutes of Health (US), National Natural Science Foundation of China, National Science Foundation (US), Unité de Glycobiologie Structurale et Fonctionnelle (UGSF), Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), European Bioinformatics Institute [Hinxton] (EMBL-EBI), EMBL Heidelberg, Biomolecular Modelling Laboratory [London], The Francis Crick Institute [London], Jiangsu University of Technology [Changzhou], Department of Electrical Engineering and Computer Science [Columbia] (EECS), University of Missouri [Columbia] (Mizzou), University of Missouri System-University of Missouri System, Institute for Data Science and Informatics [Columbia], University of Gdańsk (UG), Faculty of Electronics, Telecommunications and Informatics [GUT Gdańsk] (ETI), Gdańsk University of Technology (GUT), Medical University of Gdańsk, Graduate School of Medical Sciences [Nagoya], Nagoya City University [Nagoya, Japan], International University of Health and Welfare Hospital (IUHW Hospital), Department of Chemical and Biomolecular Engineering [Baltimore], Johns Hopkins University (JHU), Bijvoet Center of Biomolecular Research [Utrecht], Utrecht University [Utrecht], Stony Brook University [SUNY] (SBU), State University of New York (SUNY), Innopolis University, Boston University [Boston] (BU), Russian Academy of Sciences [Moscow] (RAS), Barcelona Supercomputing Center - Centro Nacional de Supercomputacion (BSC - CNS), Universidad de La Rioja (UR), Algorithms for Modeling and Simulation of Nanosystems (NANO-D), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK), Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA), Données, Apprentissage et Optimisation (DAO), Laboratoire Jean Kuntzmann (LJK), Université Grenoble Alpes (UGA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Huazhong University of Science and Technology [Wuhan] (HUST), Indiana University - Purdue University Indianapolis (IUPUI), Indiana University System, Graduate School of Information Sciences [Sendaï], Tohoku University [Sendai], National Institutes for Quantum and Radiological Science and Technology (QST), University of Maryland [Baltimore], King Abdullah University of Science and Technology (KAUST), University of Naples Federico II, Texas A&M University [Galveston], Seoul National University [Seoul] (SNU), Kitasato University, University of Kansas [Lawrence] (KU), Vilnius University [Vilnius], University of Missouri System, VIB-VUB Center for Structural Biology [Bruxelles], VIB [Belgium], Sub NMR Spectroscopy, Sub Overig UiLOTS, Sub Mathematics Education, NMR Spectroscopy, Université de Lille, CNRS, Unité de Glycobiologie Structurale et Fonctionnelle (UGSF) - UMR 8576, European Bioinformatics Institute [Hinxton] [EMBL-EBI], Department of Electrical Engineering and Computer Science [Columbia] [EECS], Faculty of Chemistry [Univ Gdańsk], Faculty of Electronics, Telecommunications and Informatics [GUT Gdańsk] [ETI], International University of Health and Welfare Hospital [IUHW Hospital], Johns Hopkins University [JHU], Stony Brook University [SUNY] [SBU], Department of Biomedical Engineering [Boston], Instituto de Ciencias de la Vid y el Vino [ICVV], Huazhong University of Science and Technology [Wuhan] [HUST], Indiana University - Purdue University Indianapolis [IUPUI], National Institutes for Quantum and Radiological Science and Technology [QST], King Abdullah University of Science and Technology [KAUST], Università degli Studi di Napoli 'Parthenope' = University of Naples [PARTHENOPE], Seoul National University [Seoul] [SNU], University of Kansas [Lawrence] [KU], University of Missouri [Columbia] [Mizzou], Unité de Glycobiologie Structurale et Fonctionnelle - UMR 8576 (UGSF), Université de Lille-Centre National de la Recherche Scientifique (CNRS), University of Naples Federico II = Università degli studi di Napoli Federico II, European Project: 675728,H2020,H2020-EINFRA-2015-1,BioExcel(2015), European Project: 823830,H2020-EU.1.4.1.3. Development, deployment and operation of ICT-based e-infrastructures, H2020-EU.1.4. EXCELLENT SCIENCE - Research Infrastructures ,BioExcel-2(2019), European Project: 777536,H2020-EU.1.4.1.3. Development, deployment and operation of ICT-based e-infrastructures, and H2020-EU.1.4. EXCELLENT SCIENCE - Research Infrastructures,EOSC-hub(2018)
- Subjects
Models, Molecular; blind prediction; CAPRI; CASP; docking; oligomeric state; protein assemblies; protein complexes; protein docking; protein–protein interaction; template-based modeling; Computer science; [SDV]Life Sciences [q-bio]; Machine learning; computer.software_genre; Biochemistry; Article; protein-protein interaction; 03 medical and health sciences; Sequence Analysis, Protein; Structural Biology; Server; Protein Interaction Domains and Motifs; Molecular Biology; ComputingMilieux_MISCELLANEOUS; 030304 developmental biology; 0303 health sciences; Binding Sites; business.industry; 030302 biochemistry & molecular biology; Computational Biology; Proteins; 3. Good health; Molecular Docking Simulation; Artificial intelligence; business; computer; Software
- Abstract
We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups, including eight automatic servers, submitted ~1250 models per target. Twenty groups, including six servers, participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70–75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance, with more groups submitting correct models for 70–80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups. In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands.
Cancer Research UK, Grant/Award Number: FC001003; Changzhou Science and Technology Bureau, Grant/Award Number: CE20200503; Department of Energy and Climate Change, Grant/Award Numbers: DE-AR001213, DE-SC0020400, DE-SC0021303; H2020 European Institute of Innovation and Technology, Grant/Award Numbers: 675728, 777536, 823830; Institut national de recherche en informatique et en automatique (INRIA), Grant/Award Number: Cordi-S; Lietuvos Mokslo Taryba, Grant/Award Numbers: S-MIP-17-60, S-MIP-21-35; Medical Research Council, Grant/Award Number: FC001003; Japan Society for the Promotion of Science KAKENHI, Grant/Award Number: JP19J00950; Ministerio de Ciencia e Innovación, Grant/Award Number: PID2019-110167RB-I00; Narodowe Centrum Nauki, Grant/Award Numbers: UMO-2017/25/B/ST4/01026, UMO-2017/26/M/ST4/00044, UMO-2017/27/B/ST4/00926; National Institute of General Medical Sciences, Grant/Award Numbers: R21GM127952, R35GM118078, RM1135136, T32GM132024; National Institutes of Health, Grant/Award Numbers: R01GM074255, R01GM078221, R01GM093123, R01GM109980, R01GM133840, R01GN123055, R01HL142301, R35GM124952, R35GM136409; National Natural Science Foundation of China, Grant/Award Number: 81603152; National Science Foundation, Grant/Award Numbers: AF1645512, CCF1943008, CMMI1825941, DBI1759277, DBI1759934, DBI1917263, DBI20036350, IIS1763246, MCB1925643; NWO, Grant/Award Number: TOP-PUNT 718.015.001; Wellcome Trust, Grant/Award Number: FC001003
- Published
- 2021
66. PDBx/mmCIF Ecosystem: Foundational Semantic Tools for Structural Biology
- Author
-
John D. Westbrook, Jasmine Y. Young, Chenghua Shao, Zukang Feng, Vladimir Guranovic, Catherine L. Lawson, Brinda Vallat, Paul D. Adams, John M Berrisford, Gerard Bricogne, Kay Diederichs, Robbie P. Joosten, Peter Keller, Nigel W. Moriarty, Oleg V. Sobolev, Sameer Velankar, Clemens Vonrhein, David G. Waterman, Genji Kurisu, Helen M. Berman, Stephen K. Burley, and Ezra Peisach
- Subjects
Biochemistry & Molecular Biology ,Macromolecular Substances ,Protein Conformation ,Bioengineering ,Microbiology ,Databases ,Medicinal and Biomolecular Chemistry ,Structural Biology ,Underpinning research ,ddc:570 ,protein data bank ,Databases, Protein ,Molecular Biology ,Crystallography ,Protein ,Computational Biology ,macromolecular structure ,1.5 Resources and infrastructure (underpinning) ,Semantics ,Networking and Information Technology R&D (NITRD) ,data standard ,data management ,Generic health relevance ,Biochemistry and Cell Biology ,biological data ,Software - Abstract
PDBx/mmCIF, Protein Data Bank Exchange (PDBx) macromolecular Crystallographic Information Framework (mmCIF), has become the data standard for structural biology. With its early roots in the domain of small-molecule crystallography, PDBx/mmCIF provides an extensible data representation that is used for deposition, archiving, remediation, and public dissemination of experimentally determined three-dimensional (3D) structures of biological macromolecules by the Worldwide Protein Data Bank (wwPDB, wwpdb.org). Extensions of PDBx/mmCIF are similarly used for computed structure models by ModelArchive (modelarchive.org), integrative/hybrid structures by PDB-Dev (pdb-dev.wwpdb.org), small angle scattering data by Small Angle Scattering Biological Data Bank SASBDB (sasbdb.org), and for models generated with the AlphaFold 2.0 deep learning software suite (alphafold.ebi.ac.uk). Community-driven development of PDBx/mmCIF spans three decades, involving contributions from researchers, software and methods developers in structural sciences, data repository providers, scientific publishers, and professional societies. Having a semantically rich and extensible data framework for representing a wide range of structural biology experimental and computational results, combined with expertly curated 3D biostructure data sets in public repositories, accelerates the pace of scientific discovery. Herein, we describe the architecture of the PDBx/mmCIF data standard, tools used to maintain representations of the data standard, governance, and processes by which data content standards are extended, plus community tools/software libraries available for processing and checking the integrity of PDBx/mmCIF data. Use cases exemplify how the members of the Worldwide Protein Data Bank have used PDBx/mmCIF as the foundation for their pipeline for delivering Findable, Accessible, Interoperable, and Reusable (FAIR) data to many millions of users worldwide.
- Published
- 2021
67. Characterizing and explaining impact of disease-associated mutations in proteins without known structures or structural homologues
- Author
-
David Baker, Christine A. Orengo, Nicola Bordin, Ian Sillitoe, Neeladri Sen, Sameer Velankar, and Ivan Anishchenko
- Subjects
Protein structure ,Mechanism (biology) ,Model quality ,Computational biology ,Disease ,Biology ,Human proteins - Abstract
Mutations in human proteins lead to diseases. The structure of these proteins can help understand the mechanism of such diseases and develop therapeutics against them. With improved deep learning techniques such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the absence of structural homologues. We modeled and extracted the domains from 553 disease-associated human proteins without known protein structures or close homologues in the Protein Data Bank (PDB). We noticed that the model quality was higher and the RMSD lower between AlphaFold and RoseTTAFold models for domains that could be assigned to CATH families as compared to those which could only be assigned to Pfam families of unknown structure or could not be assigned to either. We predicted ligand-binding sites, protein-protein interfaces, and conserved residues in these predicted structures. We then explored whether the disease-associated missense mutations were in the proximity of these predicted functional sites, whether they destabilized the protein structure based on ddG calculations, or whether they were predicted to be pathogenic. We could explain 80% of these disease-associated mutations based on proximity to functional sites, structural destabilization or pathogenicity. When compared to polymorphisms, a larger percentage of disease-associated missense mutations were buried, closer to predicted functional sites, predicted as destabilising and/or pathogenic. Usage of models from the two state-of-the-art techniques provides better confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models which could not be explained based solely on AlphaFold models.
- Published
- 2021
68. Author response for 'Prediction of protein assemblies, the next frontier: The CASP14‐CAPRI experiment'
- Author
-
Yumeng Yan, Mateusz Kogut, Sohee Kwon, Israel Desta, Petras J. Kundrotas, Xiaoqin Zou, Xiao Wang, Dima Kozakov, Eiichiro Ichiishi, Kathryn A. Porter, Johnathan D. Guest, Brian G. Pierce, Daisuke Kihara, Česlovas Venclovas, Agnieszka G. Lipska, Luigi Cavallo, Panagiotis I. Koukos, Yang Shen, Ren Kong, Brian Jiménez-García, Kliment Olechnovič, Cezary Czaplewski, Peicong Lin, Sameer Velankar, Shoshana J. Wodak, Agnieszka S. Karczyńska, Emilia A. Lubecka, Mikhail Ignatov, Shan Chang, Daipayan Sarkar, Sheng-You Huang, Chaok Seok, Nurul Nadzirin, Hao Li, Anna Antoniak, Manon Réau, Hyeonuk Woo, Siri Camee van Keulen, Ryota Ashizawa, Nasser Hashemi, Adam Liwo, Zhen Cao, Yoshiki Harada, Genki Terashi, Ameya Harmalkar, Farhan Quadir, Shinpei Kobayashi, Sandor Vajda, Zuzana Jandova, Juan Fernández-Recio, Amar Singh, Martyna Maszota-Zieleniak, Rodrigo V. Honorato, Usman Ghani, Sergei Grudinin, Xufeng Lu, Jorge Roel-Touris, Ming Liu, Paul A. Bates, Ghazaleh Taherzadeh, Adam K. Sieradzan, Patryk A. Wesołowski, Théo Mauri, Ilya A. Vakser, Francesco Ambrosetti, Jinsol Yang, Sergei Kotelnikov, Hang Shi, Shuang Zhang, Marc F. Lensink, Justas Dapkūnas, Yasuomi Kiyota, Taeyong Park, Mayuko Takeda-Shitaka, Andrey Alekseenko, Jian Liu, Artur Giełdoń, Ragul Gowthaman, Jonghun Won, Tsukasa Nakamura, Tunde Aderinwale, Yuanfei Sun, Guillaume Brysbaert, Jeffrey J. Gray, Luis A. Rodríguez-Lumbreras, Yuya Hanazono, Charlotte W. van Noort, Carlos A. Del Carpio Muñoz, Rui Duan, Alexandre M. J. J. Bonvin, Jianlin Cheng, Liming Qiu, Tereza Clarence, Rui Yin, Guangbo Yang, Shaowen Zhu, Didier Barradas-Bautista, Rafał Ślusarz, Raphael A. G. Chaleil, Charles Christoffer, Jacob Verburgt, Dzmitry Padhorny, Zhuyezi Sun, Romina Oliva, Mireia Rosell, Raj S. Roy, Bin Liu, and Karolina Zięba
- Subjects
Frontier ,Computer science ,Econometrics - Published
- 2021
69. The Protein Data Bank Archive
- Author
-
Sameer Velankar, Stephen K. Burley, Genji Kurisu, Jeffrey C. Hoch, and John L. Markley
- Subjects
Europe ,Models, Molecular ,User-Computer Interface ,Japan ,Macromolecular Substances ,Protein Conformation ,Proteins ,Reproducibility of Results ,Crystallography, X-Ray ,Databases, Protein ,Nuclear Magnetic Resonance, Biomolecular ,Data Curation ,Data Accuracy - Abstract
The Protein Data Bank is the single worldwide archive of experimentally determined macromolecular structure data. Established in 1971 as the first open access data resource in biology, the PDB archive is managed by the worldwide Protein Data Bank (wwPDB) consortium, which has four partners: the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The PDB archive currently includes ~175,000 entries. The wwPDB has established a number of task forces and working groups that bring together experts from the community who provide recommendations on improving data standards and data validation for improving data quality and integrity. The wwPDB members continue to develop the joint deposition, biocuration, and validation system (OneDep) to improve data quality and accommodate new data from emerging techniques such as 3DEM. Each PDB entry contains a coordinate model and associated metadata for all experimentally determined atomic structures, experimental data for the traditional structure determination techniques (X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy), validation reports, and additional information on quaternary structures. The wwPDB partners are committed to following the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles and have implemented a DOI resolution mechanism that provides access to all the relevant files for a given PDB entry. On average, 250 new entries are added to the archive every week and made available by each wwPDB partner via its FTP area. The wwPDB partner sites also develop data access and analysis tools and make these available via their websites. wwPDB continues to work with experts in the community to establish a federation of archives for archiving structures determined using integrative/hybrid methods, where multiple experimental techniques are used.
- Published
- 2021
70. Modeling protein interactions and complexes in CAPRI: Seventh CAPRI evaluation meeting, April 3‐5 EMBL‐EBI, Hinxton, UK
- Author
-
Shoshana J. Wodak, Michael J.E. Sternberg, and Sameer Velankar
- Subjects
Structural Biology ,Computational biology ,Biology ,Molecular Biology ,Biochemistry ,Protein–protein interaction - Published
- 2020
71. PDBe-KB: a community-driven resource for structural and functional annotations
- Author
-
Harry Jubb, Aleksandras Gutmanas, Radka Svobodová, Stephen Anyango, Sreenath Nair, Manjeet Kumar, Jonathan D. Tyzack, Leandro G Radusky, Toby J. Gibson, Liang-Chin Huang, Luis Serrano, Eloy Villasclaras Fernandez, Sameer Velankar, Petr Škoda, Michael J.E. Sternberg, Mark N. Wass, Fábio Madeira, Christine A. Orengo, Rishabh Jain, Stuart A. MacGowan, Patrizio Di Micco, Sayoni Das, Emmanuel D. Levy, Natarajan Kannan, John M. Berrisford, Tom L. Blundell, Janet M. Thornton, Radoslav Krivak, Christos C. Kannas, Lukáš Pravda, Bissan Al-Lazikani, Jose M. Dana, Abhik Mukhopadhyay, David R. Armstrong, Saqib Mir, Mihaly Varadi, Franca Fraternali, Karel Berka, Mallur S. Madhusudhan, Jake E McGreig, Mandar Deshpande, Neera Borkakoti, Luca Parca, António J. M. Ribeiro, Ian Sillitoe, Henry J Martell, Manuela Helmer-Citterich, Sucharita Dey, David Hoksza, Gulzar Singh, Jaroslav Koča, Typhaine Paysan-Lafosse, Geoffrey J. Barton, Alfonso Valencia, Wim F. Vranken, Biotechnology and Biological Sciences Research Council (BBSRC), Faculty of Sciences and Bioengineering Sciences, Basic (bio-) Medical Sciences, Chemistry, Informatics and Applied Informatics, Department of Bio-engineering Sciences, and Apollo - University of Cambridge Repository
- Subjects
Knowledge Bases ,05 Environmental Sciences ,Interoperability ,Protein Data Bank (RCSB PDB) ,Context (language use) ,Biology ,Market fragmentation ,Workflow ,Set (abstract data type) ,03 medical and health sciences ,User-Computer Interface ,0302 clinical medicine ,Genetics ,Database Issue ,PDBe-KB consortium ,Databases, Protein ,030304 developmental biology ,0303 health sciences ,Internet ,Information retrieval ,Settore BIO/11 ,Proteins ,computer.file_format ,06 Biological Sciences ,Protein Data Bank ,Europe ,Data exchange ,08 Information and Computing Sciences ,UniProt ,computer ,030217 neurology & neurosurgery ,Developmental Biology - Abstract
The Protein Data Bank in Europe-Knowledge Base (PDBe-KB, https://pdbe-kb.org) is a community-driven, collaborative resource for literature-derived, manually curated and computationally predicted structural and functional annotations of macromolecular structure data, contained in the Protein Data Bank (PDB). The goal of PDBe-KB is two-fold: (i) to increase the visibility and reduce the fragmentation of annotations contributed by specialist data resources, and to make these data more findable, accessible, interoperable and reusable (FAIR) and (ii) to place macromolecular structure data in their biological context, thus facilitating their use by the broader scientific community in fundamental and applied research. Here, we describe the guidelines of this collaborative effort, the current status of contributed data, and the PDBe-KB infrastructure, which includes the data exchange format, the deposition system for added value annotations, the distributable database containing the assembled data, and programmatic access endpoints. We also describe a series of novel web-pages—the PDBe-KB aggregated views of structure data—which combine information on macromolecular structures from many PDB entries. We have recently released the first set of pages in this series, which provide an overview of available structural and functional information for a protein of interest, referenced by a UniProtKB accession.
- Published
- 2019
72. Announcing mandatory submission of PDBx/mmCIF format files for crystallographic depositions to the Protein Data Bank (PDB)
- Author
-
Marcin Wojdyr, Paul D. Adams, David G. Brown, Zukang Feng, Yasuyo Ikegawa, Lora Mak, Minyu Chen, Billy K. Poon, John D. Westbrook, Ezra Peisach, Jasmine Young, Helen M. Berman, Masashi Yokochi, Nigel W. Moriarty, Yu-He Liang, Stephen K. Burley, John L. Markley, Garib N. Murshudov, John M. Berrisford, Eldon L. Ulrich, Sameer Velankar, Yumiko Kengaku, Aleksandras Gutmanas, Dorothee Liebschner, C. Flensburg, Pavel V. Afonine, Kumaran Baskaran, Clemens Vonrhein, Jeffrey C. Hoch, Eugene Krissinel, Irina Persikova, Genji Kurisu, Oleg V. Sobolev, Martin E.M. Noble, and Gérard Bricogne
- Subjects
PDB ,PDBx ,Protein Conformation ,Computer science ,Biophysics ,Protein Data Bank (RCSB PDB) ,Crystallography, X-Ray ,010402 general chemistry ,01 natural sciences ,Databases ,OneDep ,03 medical and health sciences ,macromolecular crystallography ,Protein Data Bank ,Structural Biology ,Humans ,Letters to the Editor ,Databases, Protein ,Worldwide Protein Data Bank ,030304 developmental biology ,PDBx/mmCIF format ,validation ,0303 health sciences ,Crystallography ,Protein ,biocuration ,Macromolecular crystallography ,Proteins ,computer.file_format ,mmCIF format ,Biological Sciences ,Data dictionary ,0104 chemical sciences ,data archiving ,Physical Sciences ,Chemical Sciences ,X-Ray ,mmCIF ,wwPDB ,Database Management Systems ,data dictionary ,data standards ,computer ,Software - Abstract
This letter announces that PDBx/mmCIF format files will become mandatory for crystallographic depositions to the Protein Data Bank (PDB).
- Published
- 2019
73. Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures
- Author
-
Mandar Deshpande, Sebastian Bittrich, Karel Berka, Sameer Velankar, Alexander S. Rose, Jaroslav Koča, Radka Svobodová, Václav Bazgier, Stephen K. Burley, and David Sehnal
- Subjects
Models, Molecular ,0303 health sciences ,Internet ,business.industry ,Macromolecular Substances ,Protein Conformation ,AcademicSubjects/SCI00010 ,Data management ,Protein Data Bank (RCSB PDB) ,Context (language use) ,Biology ,Visualization ,03 medical and health sciences ,0302 clinical medicine ,Software ,Computer graphics (images) ,Web Server Issue ,Genetics ,Web application ,The Internet ,Graphics ,business ,030217 neurology & neurosurgery ,030304 developmental biology - Abstract
Large biomolecular structures are being determined experimentally on a daily basis using established techniques such as crystallography and electron microscopy. In addition, emerging integrative or hybrid methods (I/HM) are producing structural models of huge macromolecular machines and assemblies, sometimes containing 100s of millions of non-hydrogen atoms. The performance requirements for visualization and analysis tools delivering these data are increasing rapidly. Significant progress in developing online, web-native three-dimensional (3D) visualization tools was previously accomplished with the introduction of the LiteMol suite and NGL Viewers. Thereafter, Mol* development was jointly initiated by PDBe and RCSB PDB to combine and build on the strengths of LiteMol (developed by PDBe) and NGL (developed by RCSB PDB). The web-native Mol* Viewer enables 3D visualization and streaming of macromolecular coordinate and experimental data, together with capabilities for displaying structure quality, functional, or biological context annotations. High-performance graphics and data management allow users to simultaneously visualise up to hundreds of (superimposed) protein structures, stream molecular dynamics simulation trajectories, render cell-level models, or display huge I/HM structures. It is the primary 3D structure viewer used by PDBe and RCSB PDB. It can be easily integrated into third-party services. Mol* Viewer is open source and freely available at https://molstar.org/. Graphical Abstract: Overview of the large array of entities and systems that can be visualized and manipulated with the Mol* Viewer.
- Published
- 2021
74. Modernized uniform representation of carbohydrate molecules in the Protein Data Bank
- Author
-
John D. Westbrook, Stephen K. Burley, Zukang Feng, Ezra Peisach, Chenghua Shao, John M. Berrisford, Jasmine Young, Genji Kurisu, Sameer Velankar, and Yasuyo Ikegawa
- Subjects
0303 health sciences ,Computer science ,030302 biochemistry & molecular biology ,Protein Data Bank (RCSB PDB) ,Carbohydrates ,Findability ,Proteins ,Structure validation ,computer.file_format ,Computational biology ,Data dictionary ,Protein Data Bank ,External Data Representation ,Biochemistry ,03 medical and health sciences ,Reference data ,Crystallographic Information File ,Structural Biology ,Databases, Protein ,computer ,030304 developmental biology - Abstract
Since 1971, the Protein Data Bank (PDB) has served as the single global archive for experimentally determined 3D structures of biological macromolecules made freely available to the global community according to the FAIR principles of Findability–Accessibility–Interoperability–Reusability. During the first 50 years of continuous PDB operations, standards for data representation have evolved to better represent rich and complex biological phenomena. Carbohydrate molecules present in more than 14,000 PDB structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable data representation for carbohydrates occurring in the PDB structures and the corresponding reference data improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules. The PDB Exchange MacroMolecular Crystallographic Information File data dictionary now supports (i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates, (ii) uniform representation of branched entities for oligosaccharides, (iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community and (iv) annotation of glycosylation sites in proteins. For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
- Published
- 2021
75. A structural biology community assessment of AlphaFold 2 applications
- Author
-
Pires Dev, Janet M. Thornton, Kundrotas P, Roman A. Laskowski, Jänes J, Tristan I. Croll, Rodrigues Chm, Mehmet Akdel, Sameer Velankar, Bryant P, Alistair Dunham, Durairaj J, Amelie Stein, Wensi Zhu, David F. Burke, Gabriele Pozzati, Norman E. Davey, Arthur O. Zalevsky, Alfonso Valencia, Porta Pardo E, Shenoy A, Liam Good, Sergey Ovchinnikov, Arne Elofsson, Kresten Lindorff-Larsen, Ruiz Serra, Pedro Beltrao, Bálint Mészáros, Adam Frost, David B. Ascher, and Neera Borkakoti
- Subjects
Science research ,Protein structure ,Structural biology ,Computer science ,Protein Data Bank (RCSB PDB) ,Computational biology - Abstract
Most proteins fold into 3D structures that determine how they function and orchestrate the biological processes of the cell. Recent developments in computational methods have led to protein structure predictions that have reached the accuracy of experimentally determined models. While this has been independently verified, the implementation of these methods across structural biology applications remains to be tested. Here, we evaluate the use of AlphaFold 2 (AF2) predictions in the study of characteristic structural elements; the impact of missense variants; function and ligand binding site predictions; modelling of interactions; and modelling of experimental structural data. For 11 proteomes, an average of 25% additional residues can be confidently modelled when compared to homology modelling, identifying structural features rarely seen in the PDB. AF2-based predictions of protein disorder and protein complexes surpass state-of-the-art tools and AF2 models can be used across diverse applications equally well compared to experimentally determined structures, when the confidence metrics are critically considered. In summary, we find that these advances are likely to have a transformative impact in structural biology and broader life science research.
- Published
- 2021
- Full Text
- View/download PDF
76. Cover Image, Volume 88, Issue 8
- Author
-
Marc F. Lensink, Nurul Nadzirin, Sameer Velankar, and Shoshana J. Wodak
- Subjects
Structural Biology ,Molecular Biology ,Biochemistry - Published
- 2020
77. BinaryCIF and CIFTools-Lightweight, efficient and extensible macromolecular data management
- Author
-
Radka Svobodová, Stephen K. Burley, Jaroslav Koča, Sameer Velankar, Alexander S. Rose, Sebastian Bittrich, and David Sehnal
- Subjects
0301 basic medicine ,Models, Molecular ,Research Facilities ,Computer science ,Data management ,Interoperability ,computer.software_genre ,Information Centers ,Database and Informatics Methods ,0302 clinical medicine ,Biology (General) ,Data Management ,Crystallography ,Ecology ,Database ,Molecular Structure ,Archives ,Physics ,computer.file_format ,Data dictionary ,Macromolecular Crystallography ,Chemistry ,Computational Theory and Mathematics ,Macromolecules ,Modeling and Simulation ,Physical Sciences ,Crystallographic Techniques ,Engineering and Technology ,TypeScript ,Research Article ,Computer and Information Sciences ,QH301-705.5 ,Macromolecular Substances ,Serialization ,Viral Structure ,Research and Analysis Methods ,Microbiology ,03 medical and health sciences ,Cellular and Molecular Neuroscience ,Virology ,Quantization ,Genetics ,Molecular Biology ,Ecology, Evolution, Behavior and Systematics ,Chemical Physics ,Data curation ,business.industry ,Biology and Life Sciences ,Data Compression ,Polymer Chemistry ,Visualization ,Crystallographic Information File ,030104 developmental biology ,Biological Databases ,Signal Processing ,business ,computer ,030217 neurology & neurosurgery ,Databases, Chemical ,Software - Abstract
3D macromolecular structural data is growing ever more complex and plentiful in the wake of substantive advances in experimental and computational structure determination methods including macromolecular crystallography, cryo-electron microscopy, and integrative methods. Efficient means of working with 3D macromolecular structural data for archiving, analyses, and visualization are central to facilitating interoperability and reusability in compliance with the FAIR Principles. We address two challenges posed by growth in data size and complexity. First, data size is reduced by bespoke compression techniques. Second, complexity is managed through improved software tooling and fully leveraging available data dictionary schemas. To this end, we introduce BinaryCIF, a serialization of Crystallographic Information File (CIF) format files that maintains full compatibility with related data schemas, such as PDBx/mmCIF, while reducing file sizes by more than a factor of two versus gzip compressed CIF files. Moreover, for the largest structures, BinaryCIF provides even better compression: factors of ten and four versus CIF files and gzipped CIF files, respectively. Herein, we describe CIFTools, a set of libraries in Java and TypeScript for generic and typed handling of CIF and BinaryCIF files. Together, BinaryCIF and CIFTools enable lightweight, efficient, and extensible handling of 3D macromolecular structural data.
- Published
- 2020
78. Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation
- Author
-
Christine A. Orengo, Lawrence A. Kelley, Daniel W. A. Buchan, Gustavo A. Salazar, Julian Gough, Michael J.E. Sternberg, Tom L. Blundell, Arun Prasad Pandurangan, Robert D. Finn, Antonina Andreeva, Sameer Velankar, Alexey G. Murzin, Su Datt Lam, Marcin J. Skwark, Ian Sillitoe, Typhaine Paysan-Lafosse, David T. Jones, Blundell, Tom [0000-0002-2708-8992], Skwark, Marcin [0000-0002-2022-6766], Apollo - University of Cambridge Repository, and Biotechnology and Biological Sciences Research Council (BBSRC)
- Subjects
InterPro ,Information retrieval ,05 Environmental Sciences ,Proteins ,06 Biological Sciences ,Biology ,Data submission ,Annotation ,User-Computer Interface ,Resource (project management) ,Workflow ,Genetics ,Selection (linguistics) ,Database Issue ,08 Information and Computing Sciences ,UniProt ,Databases, Protein ,Protocol (object-oriented programming) ,Genetics & Genomics ,Developmental Biology ,Structural Biology & Biophysics ,Computational & Systems Biology - Abstract
Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being ‘pushed’ to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including providing instant validation of data and avoiding the requirement to synchronise releases between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements facilitate Genome3D being opened up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.
- Published
- 2020
79. Protein Data Bank: the single global archive for 3D macromolecular structure data
- Author
-
Masashi Yokochi, Ju Yaen Kim, Chenghua Shao, John M. Berrisford, Hongyang Yao, Miron Livny, Stephen Anyango, Abhik Mukhopadhyay, Romana Gáborová, Yi-Ping Tao, Monica Sekharan, Aleksandras Gutmanas, Jose M. Dana, Mandar Deshpande, Charmi Bhikadiya, Yannis Ioannidis, Pedro Romero, Jonathan R. Wedell, Eldon L. Ulrich, Gert-Jan Bekker, Chris Randle, Chunxiao Bi, Jeffrey C. Hoch, Nurul Nadzirin, Jaroslav Koča, Yumiko Kengaku, Jasmine Young, Cole Christie, John D. Westbrook, Naohiro Kobayashi, Alexander S. Rose, Sameer Velankar, David Sehnal, Lukáš Pravda, David R. Armstrong, Hasumi Cho, Genji Kurisu, Lora Mak, John L. Markley, Saqib Mir, Sutapa Ghosh, Ardan Patwardhan, Zukang Feng, Stephen K. Burley, Robert Lowe, David S. Goodsell, Hirofumi Suzuki, Maria Voigt, Paul Gane, Jose M. Duarte, Osman Salih, Irina Periskova, Matthew J. Conroy, Toshimichi Fujiwara, Yasuyo Ikegawa, Takahiro Kudou, Dimitri Maziuk, Typhaine Paysan-Lafosse, Brian P. Hudson, Christine Zardecki, Sreenath Nair, Gerard J. Kleywegt, Marina A. Zhuravleva, Shuchismita Dutta, Dmytro Guzenko, Kumaran Baskaran, Rachel Kramer Green, Ezra Peisach, Li Chen, Reiko Yamashita, Vladimir Guranovic, Yu-He Liang, Takeshi Iwata, Atsushi Nakagawa, Haruki Nakamura, Junko Sato, Radka Svobodová Vařeková, Helen M. Berman, Deepti Gupta, Luigi Di Costanzo, Mihaly Varadi, Yana Valasatava, Burley, S. K., Berman, H. M., Bhikadiya, C., Bi, C., Chen, L., DI COSTANZO, Luigi, Addeo, PIETRO FRANCESCO BRUNO CHRISTI, Duarte, J. M., Dutta, S., Feng, Z., Ghosh, S., Goodsell, D. S., Green, R. K., Guranovic, V., Guzenko, D., Hudson, B. P., Liang, Y., Lowe, R., Peisach, E., Periskova, I., Randle, C., Rose, A., Sekharan, M., Shao, C., Tao, Y. -P., Valasatava, Y., Voigt, M., Westbrook, J., Young, J., Zardecki, C., Zhuravleva, M., Kurisu, G., Nakamura, H., Kengaku, Y., Cho, H., Sato, J., Kim, J. Y., Ikegawa, Y., Nakagawa, A., Yamashita, R., Kudou, T., Bekker, G.-J., Suzuki, H., Iwata, T., Yokochi, M., Kobayashi, N., Fujiwara, T., Velankar, S., Kleywegt, G. J., Anyango, S., Armstrong, D. R., Berrisford, J. M., Conroy, M. J., Dana, J. M., Deshpande, M., Gane, P., Gaborova, R., Gupta, D., Gutmanas, A., Koca, J., Mak, L., EL MIR, Abdelouahad, Mukhopadhyay, A., Nadzirin, N., Nair, S., Patwardhan, A., Paysan-Lafosse, T., Pravda, L., Salih, O., Sehnal, D., Varadi, M., Varekova, R., Markley, J. L., Hoch, J. C., Romero, P. R., Baskaran, K., Maziuk, D., Ulrich, E. L., Wedell, J. R., Sicong, Yao, Livny, M., and Ioannidis, Y. E.
- Subjects
Models, Molecular ,Protein Conformation ,Molecular Conformation ,Protein Data Bank (RCSB PDB) ,Master data ,Biology ,computer.software_genre ,03 medical and health sciences ,0302 clinical medicine ,Genetics ,Database Issue ,RDF ,Databases, Protein ,030304 developmental biology ,Structure (mathematical logic) ,0303 health sciences ,Database ,Experimental data ,DNA ,computer.file_format ,Atomic coordinates ,Protein Data Bank ,Metadata ,Metals ,Nucleic Acid Conformation ,RNA ,computer ,030217 neurology & neurosurgery - Abstract
The Protein Data Bank (PDB) is the single global archive of experimentally determined three-dimensional (3D) structure data of biological macromolecules. Since 2003, the PDB has been managed by the Worldwide Protein Data Bank (wwPDB; wwpdb.org), an international consortium that collaboratively oversees deposition, validation, biocuration, and open access dissemination of 3D macromolecular structure data. The PDB Core Archive houses 3D atomic coordinates of more than 144 000 structural models of proteins, DNA/RNA, and their complexes with metals and small molecules and related experimental data and metadata. Structure and experimental data/metadata are also stored in the PDB Core Archive using the readily extensible wwPDB PDBx/mmCIF master data format, which will continue to evolve as data/metadata from new experimental techniques and structure determination methods are incorporated by the wwPDB. Impacts of the recently developed universal wwPDB OneDep deposition/validation/biocuration system and various methods-specific wwPDB Validation Task Forces on improving the quality of structures and data housed in the PDB Core Archive are described together with current challenges and future plans.
- Published
- 2018
80. Federating Structural Models and Data: Outcomes from A Workshop on Archiving Integrative Structures
- Author
-
Kate L. White, Frank DiMaio, Thomas D. Goddard, David C. Schriemer, Andrej Sali, Emad Tajkhorshid, Sameer Velankar, Christian A. Hanke, Jeffrey C. Hoch, Catherine L. Lawson, Brinda Vallat, Margaret Gabanyi, Benjamin Webb, Gerhard Hummer, Patrick R. Griffin, Alexandre A. Bonvin, Bridget Carragher, John L. Markley, Gaetano T. Montelione, Paul D. Adams, John D. Westbrook, Genji Kurisu, Jill Trewhella, Jens Meiler, Geerten W. Vuister, Thomas F. Prisner, Dmitri I. Svergun, Torsten Schwede, Helen M. Berman, George N. Phillips, Stephen K. Burley, Juri Rappsilber, Claus A. M. Seidel, Wah Chiu, Timothy S. Strutzenberg, Thomas E. Ferrin, Alexander Leitner, and Juergen Haas
- Subjects
Magnetic Resonance Spectroscopy ,Protein Conformation ,Computer science ,Biophysics ,Article ,Databases ,03 medical and health sciences ,Models ,Structural Biology ,Information and Computing Sciences ,Taverne ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,Crystallography ,Extramural ,Protein ,030302 biochemistry & molecular biology ,Proteins ,Computational Biology ,Molecular ,Biological Sciences ,Data science ,Data exchange ,Chemical Sciences ,X-Ray ,Experimental methods - Abstract
Structures of biomolecular systems are increasingly computed by integrative modeling. In this approach, a structural model is constructed by combining information from multiple sources, including varied experimental methods and prior models. In 2019, a Workshop was held as a Biophysical Society Satellite Meeting to assess progress and discuss further requirements for archiving integrative structures. The primary goal of the Workshop was to build consensus for addressing the challenges involved in creating common data standards, building methods for federated data exchange, and developing mechanisms for validating integrative structures. The summary of the Workshop and the recommendations that emerged are presented here. Introduction: When the Protein Data Bank (PDB) (Protein Data Bank, 1971) was first established in 1971, X-ray crystallography was the only method for determining three-dimensional structures of biological macromolecules at sufficient resolution to build atomic models. A decade later, structures of biomolecules in solution could also be determined by nuclear magnetic resonance (NMR) spectroscopy (Williamson et al., 1985). Recently, three-dimensional cryoelectron microscopy (3DEM) (Henderson et al., 1990) began to achieve unprecedented near-atomic resolution for large complex assemblies. Increasingly, investigators are also modeling structures based on data from more than one method (Rout and Sali, 2019). These integrative/hybrid approaches to structure determination consist of collecting information about a system using multiple experimental and computational methods, followed by integrative/hybrid modeling that converts this information into integrative/hybrid structure models. For succinctness, we will use the term integrative hereafter to refer to integrative/hybrid approaches, modeling, and models.
- Published
- 2019
81. PDBe: improved findability of macromolecular structure data in the PDB
- Author
-
Nurul Nadzirin, Stephen Anyango, Mandar Deshpande, Radka Svobodová Vařeková, Saqib Mir, Osman Salih, Mihaly Varadi, Gerard J. Kleywegt, David R. Armstrong, Lukáš Pravda, Jose M. Dana, James Tolchard, Preeti Choudhary, Alice R. Clark, Paul Gane, Matthew J. Conroy, Hossam Zaki, Sreenath Nair, Sameer Velankar, Typhaine Paysan-Lafosse, Aleksandras Gutmanas, David Sehnal, Deepti Gupta, Jaroslav Koča, Oliver S. Smart, John M. Berrisford, Abhik Mukhopadhyay, Romana Gáborová, Pauline Haslam, Roisin Dunlop, and Lora Mak
- Subjects
Structure (mathematical logic) ,0303 health sciences ,Information retrieval ,Protein Conformation ,030302 biochemistry & molecular biology ,Protein Data Bank (RCSB PDB) ,Findability ,Rfam ,computer.file_format ,Biology ,Protein Data Bank ,Data Accuracy ,Visualization ,Europe ,User-Computer Interface ,03 medical and health sciences ,Identification (information) ,Data access ,Genetics ,Database Issue ,Cluster Analysis ,Databases, Protein ,computer ,Software ,030304 developmental biology - Abstract
The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has developed an advanced search facility with ∼100 data categories and sequence searches. New features have been included in the LiteMol viewer at PDBe, with updated visualization of carbohydrates and nucleic acids. Small molecules are now mapped more extensively to external databases and their visual representation has been enhanced. These advances help users to more easily find and interpret macromolecular structure data in order to solve scientific problems.
- Published
- 2019
82. Structural biology data archiving - where we are and what lies ahead
- Author
-
Gerard J. Kleywegt, Ardan Patwardhan, and Sameer Velankar
- Subjects
Models, Molecular ,0301 basic medicine ,Protein Conformation ,Computer science ,Biophysics ,Review Article ,Crystallography, X-Ray ,Biochemistry ,Field (computer science) ,03 medical and health sciences ,Structural Biology ,Genetics ,Data bank ,bioimaging ,Databases, Protein ,Review Articles ,Molecular Biology ,Data Curation ,Structure (mathematical logic) ,Biological data ,Proteins ,Experimental data ,Cell Biology ,computer.file_format ,Protein Data Bank ,Data science ,Sketch ,data archiving ,030104 developmental biology ,Structural biology ,computer - Abstract
For almost 50 years, structural biology has endeavoured to conserve and share its experimental data and their interpretations (usually, atomistic models) through global public archives such as the Protein Data Bank, Electron Microscopy Data Bank and Biological Magnetic Resonance Data Bank (BMRB). These archives are treasure troves of freely accessible data that document our quest for molecular or atomic understanding of biological function and processes in health and disease. They have prepared the field to tackle new archiving challenges as more and more (combinations of) techniques are being utilized to elucidate structure at ever increasing length scales. Furthermore, the field has made substantial efforts to develop validation methods that help users to assess the reliability of structures and to identify the most appropriate data for their needs. In this Review, we present an overview of public data archives in structural biology and discuss the importance of validation for users and producers of structural data. Finally, we sketch our efforts to integrate structural data with bioimaging data and with other sources of biological data. This will make relevant structural information available and more easily discoverable for a wide range of scientists.
- Published
- 2018
83. Validation of ligands in macromolecular structures determined by X-ray crystallography
- Author
-
Oliver S. Smart, Radka Svobodová Vařeková, Gerard J. Kleywegt, Swanand Gore, Vladimír Horský, Sameer Velankar, and Veronika Bendová
- Subjects
0301 basic medicine ,Models, Molecular ,PDB ,Computer science ,Macromolecular Substances ,Protein Conformation ,Protein Data Bank (RCSB PDB) ,Computational biology ,Crystallography, X-Ray ,03 medical and health sciences ,Structural Biology ,Protein Data Bank ,Molecule ,Humans ,Binding site ,Databases, Protein ,validation ,030102 biochemistry & molecular biology ,Molecular Structure ,Ligand ,Drug discovery ,ligands ,nutritional and metabolic diseases ,Proteins ,computer.file_format ,three-dimensional macromolecular structure ,Research Papers ,nervous system diseases ,030104 developmental biology ,Data quality ,computer ,Macromolecule - Abstract
Better metrics are required to be able to assess small-molecule ligands in macromolecular structures in Worldwide Protein Data Bank validation reports. The local ligand density fit (LLDF) score currently used to assess ligand electron-density fit outliers produces a substantial number of false positives and false negatives. Crystallographic studies of ligands bound to biological macromolecules (proteins and nucleic acids) play a crucial role in structure-guided drug discovery and design, and also provide atomic level insights into the physical chemistry of complex formation between macromolecules and ligands. The quality with which small-molecule ligands have been modelled in Protein Data Bank (PDB) entries has been, and continues to be, a matter of concern for many investigators. Correctly interpreting whether electron density found in a binding site is compatible with the soaked or co-crystallized ligand or represents water or buffer molecules is often far from trivial. The Worldwide PDB validation report (VR) provides a mechanism to highlight any major issues concerning the quality of the data and the model at the time of deposition and annotation, so the depositors can fix issues, resulting in improved data quality. The ligand-validation methods used in the generation of the current VRs are described in detail, including an examination of the metrics to assess both geometry and electron-density fit. It is found that the LLDF score currently used to identify ligand electron-density fit outliers can give misleading results and that better ligand-validation metrics are required.
- Published
- 2018
84. PDBe: towards reusable data delivery infrastructure at protein data bank in Europe
- Author
-
Stephen Anyango, Sanchayita Sen, David Sehnal, Mihaly Varadi, Deepti Gupta, Pauline Haslam, Lora Mak, Matthew J. Conroy, David R. Armstrong, Jose M. Dana, Younes Alhroub, John M. Berrisford, Nurul Nadzirin, Abhik Mukhopadhyay, Aleksandras Gutmanas, Gerard J. Kleywegt, Sameer Velankar, Mandar Deshpande, Saqib Mir, Oliver S. Smart, Alice R. Clark, and Typhaine Paysan-Lafosse
- Subjects
Models, Molecular ,Protein Conformation, alpha-Helical ,0301 basic medicine ,Information Dissemination ,Context (language use) ,Biology ,World Wide Web ,User-Computer Interface ,03 medical and health sciences ,Resource (project management) ,Sequence Analysis, Protein ,Computer Graphics ,Genetics ,Humans ,Database Issue ,Amino Acid Sequence ,Databases, Protein ,Dissemination ,Reusability ,Internet ,business.industry ,Suite ,Computational Biology ,Proteins ,Molecular Sequence Annotation ,computer.file_format ,Protein Data Bank ,Europe ,030104 developmental biology ,Databases as Topic ,Protein Conformation, beta-Strand ,The Internet ,business ,computer - Abstract
The Protein Data Bank in Europe (PDBe, pdbe.org) is actively engaged in the deposition, annotation, remediation, enrichment and dissemination of macromolecular structure data. This paper describes new developments and improvements at PDBe addressing three challenging areas: data enrichment, data dissemination and functional reusability. New features of the PDBe Web site are discussed, including a context dependent menu providing links to raw experimental data and improved presentation of structures solved by hybrid methods. The paper also summarizes the features of the LiteMol suite, which is a set of services enabling fast and interactive 3D visualization of structures, with associated experimental maps, annotations and quality assessment information. We introduce a library of Web components which can be easily reused to port data and functionality available at PDBe to other services. We also introduce updates to the SIFTS resource which maps PDB data to other bioinformatics resources, and the PDBe REST API.
- Published
- 2017
85. BioJS: an open source JavaScript framework for biological data visualization.
- Author
-
John Gómez, Leyla J. García, Gustavo A. Salazar, Jose M. Villaveces, Swanand P. Gore, Alexander García Castro, Maria Jesus Martin, Guillaume Launay, Rafael Alcántara, Noemi del-Toro, Marine Dumousseau, Sandra E. Orchard, Sameer Velankar, Henning Hermjakob, Chenggong Zong, Peipei Ping, Manuel Corpas, and Rafael C. Jiménez
- Published
- 2013
86. PDBe-KB COVID-19 data portal – supporting rapid coronavirus research
- Author
-
Sameer Velankar
- Subjects
Inorganic Chemistry ,Structural Biology ,General Materials Science ,Physical and Theoretical Chemistry ,Condensed Matter Physics ,Biochemistry - Published
- 2021
87. Validation of Structures in the Protein Data Bank
- Author
-
Zukang Feng, John D. Westbrook, Monica Sekharan, Yasuyo Ikegawa, Haruki Nakamura, Stephen K. Burley, Pieter M. S. Hendrickx, Chenghua Shao, Kumaran Baskaran, Aleksandras Gutmanas, Sameer Velankar, Eduardo Sanz Garcia, Ezra Peisach, Ardan Patwardhan, Thomas J. Oldfield, Catherine L. Lawson, John L. Markley, Helen M. Berman, Oliver S. Smart, John M. Berrisford, Swanand Gore, Eldon L. Ulrich, Abhik Mukhopadhyay, Gaurav Sahni, Jasmine Young, Sanchayita Sen, Gerard J. Kleywegt, Martha Quesada, Huanwang Yang, Naohiro Kobayashi, Reiko Yamashita, Brian P. Hudson, Steve Mading, and Lora Mak
- Subjects
0301 basic medicine ,Web server ,PDB ,data deposition ,Computer science ,media_common.quotation_subject ,Protein Data Bank (RCSB PDB) ,Validation Studies as Topic ,computer.software_genre ,Bioinformatics ,Article ,03 medical and health sciences ,Sequence Analysis, Protein ,structural biology ,Quality (business) ,structure data quality ,Databases, Protein ,Molecular Biology ,media_common ,Structure (mathematical logic) ,validation ,Data processing ,Information retrieval ,biocuration ,Experimental data ,computer.file_format ,Protein Data Bank ,3D macromolecular structure ,data archiving ,030104 developmental biology ,wwPDB ,Web service ,computer - Abstract
Summary: The Worldwide PDB recently launched a deposition, biocuration, and validation tool: OneDep. At various stages of OneDep data processing, validation reports for three-dimensional structures of biological macromolecules are produced. These reports are based on recommendations of expert task forces representing crystallography, nuclear magnetic resonance, and cryoelectron microscopy communities. The reports provide useful metrics with which depositors can evaluate the quality of the experimental data, the structural model, and the fit between them. The validation module is also available as a stand-alone web server and as a programmatically accessible web service. A growing number of journals require the official wwPDB validation reports (produced at biocuration) to accompany manuscripts describing macromolecular structures. Upon public release of the structure, the validation report becomes part of the public PDB archive. Geometric quality scores for proteins in the PDB archive have improved over the past decade. Highlights: • Validation reports are available for X-ray, NMR, and EM structures in the PDB • Preliminary reports obtained at deposition, stand-alone servers, and programmatically • Official reports from biocuration should be submitted with manuscripts • Quality metrics for protein structures have improved over the past decade. Gore et al. describe the community-recommended validation reports, produced by wwPDB at deposition and biocuration of PDB submissions, and integrated into the archive of publicly released PDB entries. The authors also show that the quality of protein structures has improved over the last decade.
- Published
- 2017
88. Modeling protein-protein and protein-peptide complexes: CAPRI 6th edition
- Author
-
Sameer Velankar, Shoshana J. Wodak, and Marc F. Lensink
- Subjects
0301 basic medicine ,chemistry.chemical_classification ,Multiprotein complex ,Computer science ,Protein protein ,Peptide ,Computational biology ,Biochemistry ,Molecular Docking Simulation ,03 medical and health sciences ,030104 developmental biology ,Protein structure ,chemistry ,Structural Biology ,Docking (molecular) ,Critical assessment ,Macromolecular docking ,Molecular Biology - Abstract
We present the sixth report evaluating the performance of methods for predicting the atomic resolution structures of protein complexes offered as targets to the community-wide initiative on the Critical Assessment of Predicted Interactions (CAPRI). The evaluation is based on a total of 20,670 predicted models for 8 protein-peptide complexes, a novel category of targets in CAPRI, and 12 protein-protein targets in CAPRI prediction Rounds held during the years 2013-2016. For two of the protein-protein targets, the focus was on the prediction of side-chain conformation and positions of interfacial water molecules. Seven of the protein-protein targets were particularly challenging owing to their multicomponent nature, to conformational changes at the binding site, or to a combination of both. Encouragingly, the very large multiprotein complex with the nucleosome was correctly predicted, and correct models were submitted for the protein-peptide targets, but not for some of the challenging protein-protein targets. Models of acceptable quality or better were obtained for 14 of the 20 targets, including medium quality models for 13 targets and high quality models for 8 targets, indicating tangible progress of present-day computational methods in modeling protein complexes with increased accuracy. Our evaluation suggests that the progress stems from better integration of different modeling tools with docking procedures, as well as the use of more sophisticated evolutionary information to score models. Nonetheless, adequate modeling of conformational flexibility in interacting proteins remains an important area with a crucial need for improvement. Proteins 2017; 85:359-377. © 2016 Wiley Periodicals, Inc.
- Published
- 2016
89. The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences
- Author
-
John Dylan Spalding, Guy Cochrane, Jo McEntyre, Ugis Sarkans, Heinz Stockinger, Mathias Uhlen, Pablo Porras Millan, Andrew Yates, Luana Licata, Sameer Velankar, Sandra Orchard, Alex Bateman, Nicole Redaschi, Franziska Gruhl, Robert Finn, Niklas Blomberg, Juan Antonio Vizcaino, Rodrigo López, Helen Parkinson, and Charles E Cook
- Subjects
Statistics and Probability ,Knowledge management ,Biodata ,Databases and Ontologies ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Biochemistry ,Biological Science Disciplines ,03 medical and health sciences ,0302 clinical medicine ,Resource (project management) ,Data_FILES ,ComputingMilieux_COMPUTERSANDEDUCATION ,Molecular Biology ,Letter to the Editor ,030304 developmental biology ,computer.programming_language ,0303 health sciences ,Biological data ,business.industry ,Computational Biology ,Computer Science Applications ,Computational Mathematics ,Core (game theory) ,Open data ,ComputingMethodologies_PATTERNRECOGNITION ,Computational Theory and Mathematics ,Agriculture ,Sustainability ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Elixir (programming language) ,business ,computer ,030217 neurology & neurosurgery - Abstract
Motivation: Life science research in academia, industry, agriculture, and the health sector depends critically on free and open data resources. ELIXIR (www.elixir-europe.org), the European Research Infrastructure for life sciences data, has identified a set of Core Data Resources within Europe that are of most fundamental importance for the long-term preservation of biological data. We explore characteristics of their usage, impact and assured funding horizon to assess their value and importance as an infrastructure, to understand sustainability of the infrastructure, and to demonstrate a model for assessing Core Data Resources worldwide. Results: The nineteen resources currently designated ELIXIR Core Data Resources form a data infrastructure in Europe which is a subset of the worldwide open life science data infrastructure. We show that, from 2014 to 2018, data managed by the Core Data Resources more than tripled while staff numbers increased by less than a tenth. Additionally, support for the Core Data Resources is precarious: together they have assured funding for less than a third of current staff after four years. Our findings demonstrate the importance of the ELIXIR Core Data Resources as repositories for research data and knowledge, while also demonstrating the uncertain nature of the funding environment for this infrastructure. ELIXIR is working towards longer-term support for the Core Data Resources and, through the Global Biodata Coalition, aims to ensure support for the worldwide life science data resource infrastructure of which the ELIXIR Core Data Resources are a subset.
- Published
- 2019
90. Finding enzyme cofactors in Protein Data Bank
- Author
-
Abhik Mukhopadhyay, Janet M. Thornton, Neera Borkakoti, Lukáš Pravda, Sameer Velankar, and Jonathan D. Tyzack
- Subjects
Statistics and Probability ,Computer science ,Protein Conformation ,Protein Data Bank (RCSB PDB) ,Coenzymes ,Computational biology ,Biochemistry ,Cofactor ,03 medical and health sciences ,Protein structure ,Molecule ,natural sciences ,Databases, Protein ,Molecular Biology ,030304 developmental biology ,chemistry.chemical_classification ,0303 health sciences ,biology ,030302 biochemistry & molecular biology ,computer.file_format ,Protein Data Bank ,Small molecule ,Applications Notes ,Structural Bioinformatics ,Computer Science Applications ,Europe ,Computational Mathematics ,Enzyme ,Computational Theory and Mathematics ,chemistry ,biology.protein ,computer - Abstract
Motivation: Cofactors are essential for many enzyme reactions. The Protein Data Bank (PDB) contains >67 000 entries containing enzyme structures, many with bound cofactor or cofactor-like molecules. This work aims to identify and categorize these small molecules in the PDB and make it easier to find them. Results: The Protein Data Bank in Europe (PDBe; pdbe.org) has implemented a pipeline to identify enzyme cofactor and cofactor-like molecules, which are now part of the PDBe weekly release process. Availability and implementation: Information is made available on the individual PDBe entry pages at pdbe.org and programmatically through the PDBe REST API (pdbe.org/api). Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2018
91. Enhanced validation of small-molecule ligands and carbohydrates in the Protein Data Bank
- Author
-
Jasmine Young, Issaku Yamada, John D. Westbrook, Raul Sala, Jeffrey C. Hoch, Zukang Feng, Masaaki Matsubara, Oliver S. Smart, Sameer Velankar, Genji Kurisu, Shinichiro Tsuchiya, Gérard Bricogne, Stephen K. Burley, and Kiyoko F. Aoki-Kinoshita
- Subjects
Proteomics ,Proteome ,Computer science ,Carbohydrates ,Protein Data Bank (RCSB PDB) ,Computational biology ,Ligands ,Article ,Symbol (chemistry) ,Small Molecule Libraries ,03 medical and health sciences ,Structural Biology ,Humans ,Databases, Protein ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,Drug discovery ,Ligand ,Cheminformatics ,030302 biochemistry & molecular biology ,computer.file_format ,Protein Data Bank ,Small molecule ,Molecular Docking Simulation ,Structural biology ,Biochemical function ,computer ,Databases, Chemical ,Protein Binding - Abstract
Summary: The Worldwide Protein Data Bank (wwPDB) has provided validation reports based on recommendations from community Validation Task Forces for structures in the PDB since 2013. To further enhance validation of small molecules as recommended by the 2016 Ligand Validation Workshop, wwPDB, Global Phasing Ltd., and the Noguchi Institute recently formed a public/private partnership to incorporate some of their software tools into the wwPDB validation package. Augmented wwPDB validation report features include: two-dimensional (2D) diagrams of small-molecule ligands and carbohydrates, highlighting geometric validation outcomes; 2D topological diagrams of oligosaccharides present in branched entities generated using 2D Symbol Nomenclature for Glycan representation; and views of 3D electron density maps for ligands and carbohydrates, illustrating the goodness-of-fit between the atomic structure and experimental data (X-ray crystallographic structures only). These improvements will impact confidence in ligand conformation and ligand-macromolecular interactions that will aid in understanding biochemical function and contribute to small-molecule drug discovery.
- Published
- 2021
92. UniChem: a unified chemical structure cross-referencing and identifier tracking system.
- Author
-
Jon Chambers, Mark Davies, Anna Gaulton, Anne Hersey, Sameer Velankar, Robert Petryszak, Janna Hastings, Louisa J. Bellis, Shaun McGlinchey, and John P. Overington
- Published
- 2013
93. High-performance macromolecular data delivery and visualization for the web. Corrigendum
- Author
-
Stephen K. Burley, Jaroslav Koča, David Sehnal, Sameer Velankar, Radka Svobodová, Karel Berka, and Alexander S. Rose
- Subjects
Internet ,0303 health sciences ,Information retrieval ,Macromolecular Substances ,Computer science ,030302 biochemistry & molecular biology ,Addenda and Errata ,Visualization ,User-Computer Interface ,03 medical and health sciences ,corrigendum ,Structural Biology ,data delivery ,Computer Graphics ,macromolecules ,Data delivery ,visualization ,Software ,030304 developmental biology - Abstract
The article by Sehnal et al. [(2020), Acta Cryst. D76, 1167–1173] is corrected. Two citations in the article by Sehnal et al. [(2020), Acta Cryst. D76, 1167–1173] are corrected.
- Published
- 2021
94. SIFTS: updated Structure Integration with Function, Taxonomy and Sequences resource allows 40-fold increase in coverage of structure-based annotations for proteins
- Author
-
Maria Jesus Martin, Sameer Velankar, Guoying Qi, Nidhi Tyagi, Jose M. Dana, Claire O'Donovan, and Aleksandras Gutmanas
- Subjects
InterPro ,Proteome ,Protein Conformation ,Protein Data Bank (RCSB PDB) ,Computational biology ,Biology ,03 medical and health sciences ,Mice ,0302 clinical medicine ,Protein sequencing ,Sequence Analysis, Protein ,Genetics ,Ensembl ,Animals ,Humans ,Protein Isoforms ,Database Issue ,Databases, Protein ,030304 developmental biology ,0303 health sciences ,Proteins ,Molecular Sequence Annotation ,computer.file_format ,Protein Data Bank ,Enzymes ,HomoloGene ,UniProt ,computer ,030217 neurology & neurosurgery - Abstract
The Structure Integration with Function, Taxonomy and Sequences resource (SIFTS; http://pdbe.org/sifts/) was established in 2002 and continues to operate as a collaboration between the Protein Data Bank in Europe (PDBe; http://pdbe.org) and the UniProt Knowledgebase (UniProtKB; http://uniprot.org). The resource is instrumental in the transfer of annotations between protein structure and protein sequence resources through provision of up-to-date residue-level mappings between entries from the PDB and from UniProtKB. SIFTS also incorporates residue-level annotations from other biological resources, currently comprising the NCBI taxonomy database, IntEnz, GO, Pfam, InterPro, SCOP, CATH, PubMed, Ensembl, Homologene and automatic Pfam domain assignments based on HMM profiles. The recently released implementation of SIFTS includes support for multiple cross-references for proteins in the PDB, allowing mappings to UniProtKB isoforms and UniRef90 cluster members. This development makes structure data in the PDB readily available to over 1.8 million UniProtKB accessions.
- Published
- 2018
95. The challenge of modeling protein assemblies: the CASP12-CAPRI experiment
- Author
-
Lim Heo, Sameer Velankar, Minkyung Baek, Shoshana J. Wodak, Marc F. Lensink, Chaok Seok, Unité de Glycobiologie Structurale et Fonctionnelle UMR 8576 (UGSF), Université de Lille-Institut National de la Recherche Agronomique (INRA)-Centre National de la Recherche Scientifique (CNRS), European Bioinformatics Institute [Hinxton] (EMBL-EBI), EMBL Heidelberg, Department of Chemistry, Seoul National University [Seoul] (SNU), The Hospital for sick children [Toronto] (SickKids), Unité de Glycobiologie Structurale et Fonctionnelle - UMR 8576 (UGSF), Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Recherche Agronomique (INRA), and Université de Lille-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Models, Molecular ,0301 basic medicine ,Protein Conformation ,Computer science ,[SDV]Life Sciences [q-bio] ,Protein Data Bank (RCSB PDB) ,Computational biology ,Biochemistry ,03 medical and health sciences ,Sequence Analysis, Protein ,Structural Biology ,Protein Interaction Mapping ,Humans ,Databases, Protein ,CASP ,Molecular Biology ,ComputingMilieux_MISCELLANEOUS ,030102 biochemistry & molecular biology ,Quality assessment ,Computational Biology ,Proteins ,030104 developmental biology ,Template ,Assessment methods ,Protein Multimerization ,Algorithms - Abstract
We present the quality assessment of 5613 models submitted by predictor groups from both CAPRI and CASP for the total of 15 most tractable targets from the second joint CASP-CAPRI protein assembly prediction experiment. These targets comprised 12 homo-oligomers and 3 hetero-complexes. The bulk of the analysis focuses on 10 targets (of CAPRI Round 37), which included all 3 hetero-complexes, and whose protein chains or the full assembly could be readily modeled from structural templates in the PDB. On average, 28 CAPRI groups and 10 CASP groups (including automatic servers), submitted models for each of these 10 targets. Additionally, about 16 groups participated in the CAPRI scoring experiments. A range of acceptable to high quality models were obtained for 6 of the 10 Round 37 targets, for which templates were available for the full assembly. Poorer results were achieved for the remaining targets due to the lower quality of the templates available for the full complex or the individual protein chains, highlighting the unmet challenge of modeling the structural adjustments of the protein components that occur upon binding or which must be accounted for in template-based modeling. On the other hand, our analysis indicated that residues in binding interfaces were correctly predicted in a sizable fraction of otherwise poorly modeled assemblies and this with higher accuracy than published methods that do not use information on the binding partner. Lastly, the strengths and weaknesses of the assessment methods are evaluated and improvements suggested.
- Published
- 2018
96. OneDep: Unified wwPDB System for Deposition, Biocuration, and Validation of Macromolecular Structures in the PDB Archive
- Author
-
Marina Zhuravleva, Ezra Peisach, Monica Sekharan, Glen van Ginkel, Reiko Igarashi, Jasmine Young, M. Saqib Mir, Lora Mak, Dimitris Dimitropoulos, Raul Sala, David R. Armstrong, Sanchayita Sen, Sameer Velankar, Gerard J. Kleywegt, Li Chen, Lihua Tan, Swanand Gore, Reiko Yamashita, Sutapa Ghosh, Eduardo Sanz-García, Zukang Feng, John D. Westbrook, Vladimir Guranovic, Yu-He Liang, Aleksandras Gutmanas, Thomas J. Oldfield, Brian P. Hudson, Huanwang Yang, Minyu Chen, Guanghua Gao, G. Jawahar Swaminathan, Eldon L. Ulrich, Yasuyo Ikegawa, Naohiro Kobayashi, Irina Persikova, Luigi Di Costanzo, Steve Mading, John L. Markley, Chenghua Shao, Helen M. Berman, Luana Rinaldi, Ardan Patwardhan, John M. Berrisford, Abhik Mukhopadhyay, Haruki Nakamura, Stephen K. Burley, Catherine L. Lawson, Pieter M. S. Hendrickx, Martha Quesada, Young, J. Y., Westbrook, J. D., Feng, Z., Sala, R., Peisach, E., Oldfield, T. J., Sen, S., Gutmanas, A., Armstrong, D. R., Berrisford, J. M., Chen, L., Chen, M., DI COSTANZO, Luigi, Dimitropoulos, D., Gao, G., Ghosh, S., Gore, S., Guranovic, V., Hendrickx, P. M. S., Hudson, B. P., Igarashi, R., Ikegawa, Y., Kobayashi, N., Lawson, C. L., Liang, Y., Mading, S., Mak, L., Mir, M. S., Mukhopadhyay, A., Patwardhan, A., Persikova, I., Rinaldi, L., Sanz-Garcia, E., Sekharan, M. R., Shao, C., Swaminathan, G. J., Tan, L., Ulrich, E. L., van Ginkel, G., Yamashita, R., Yang, H., Zhuravleva, M. A., Quesada, M., Kleywegt, G. J., Berman, H. M., Markley, J. L., Nakamura, H., Velankar, S., and Burley, S. K.
- Subjects
0301 basic medicine ,Models, Molecular ,data deposition ,PDB ,Computer science ,Protein Conformation ,Protein Data Bank (RCSB PDB) ,Article ,03 medical and health sciences ,User-Computer Interface ,Average size ,Protein Data Bank ,structural biology ,Databases, Protein ,Molecular Biology ,Nuclear Magnetic Resonance, Biomolecular ,Data Curation ,Research data ,validation ,Internet ,business.industry ,biocuration ,Protein ,Proteins ,computer.file_format ,research data ,3D macromolecular structure ,Unified system ,data archiving ,030104 developmental biology ,wwPDB ,Software engineering ,business ,computer - Abstract
OneDep, a unified system for deposition, biocuration, and validation of experimentally determined structures of biological macromolecules to the Protein Data Bank (PDB) archive, has been developed as a global collaboration by the Worldwide Protein Data Bank (wwPDB) partners. This new system was designed to ensure that the wwPDB could meet the evolving archiving requirements of the scientific community over the coming decades. OneDep unifies deposition, biocuration, and validation pipelines across all wwPDB, EMDB, and BMRB deposition sites with improved focus on data quality and completeness in these archives, while supporting growth in the number of depositions and increases in their average size and complexity. In this paper, we describe the design, functional operation, and supporting infrastructure of the OneDep system, and provide initial performance assessments.
- Published
- 2018
97. Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data
- Author
-
Marina Zhuravleva, Raul Sala, Lora Mak, Stephen K. Burley, Monica Sekharan, Oliver S. Smart, Brian P. Hudson, Ardan Patwardhan, Gerard J. Kleywegt, Alice R. Clark, Guanghua Gao, Kumaran Baskaran, Sutapa Ghosh, David R. Armstrong, Kayoko Nishiyama, John M. Berrisford, Ezra Peisach, Abhik Mukhopadhyay, G. Jawahar Swaminathan, Huanwang Yang, Minyu Chen, Catherine L. Lawson, Thomas J. Oldfield, Junko Sato, Zukang Feng, Helen M. Berman, Yumiko Kengaku, Chenghua Shao, Glen van Ginkel, Irina Persikova, John L. Markley, Genji Kurisu, Yasuyo Ikegawa, Jasmine Young, Pieter M. S. Hendrickx, Luigi Di Costanzo, Aleksandras Gutmanas, John D. Westbrook, Reiko Igarashi, Buvaneswari Coimbatore Narayanan, Li Chen, Eduardo Sanz-García, Vladimir Guranovic, Yu-He Liang, Haruki Nakamura, Gaurav Sahni, Sameer Velankar, Sanchayita Sen, Lihua Tan, Swanand Gore, and Dimitris Dimitropoulos
- Subjects
0301 basic medicine ,Vocabulary ,Data curation ,Protein Conformation ,Extramural ,Computer science ,media_common.quotation_subject ,MEDLINE ,computer.file_format ,Protein Data Bank ,Data science ,General Biochemistry, Genetics and Molecular Biology ,03 medical and health sciences ,030104 developmental biology ,Vocabulary, Controlled ,Structural biology ,Original Article ,Quality (business) ,Databases, Protein ,General Agricultural and Biological Sciences ,computer ,Data Curation ,Information Systems ,media_common - Abstract
The Protein Data Bank (PDB) is the single global repository for experimentally determined 3D structures of biological macromolecules and their complexes with ligands. The worldwide PDB (wwPDB) is the international collaboration that manages the PDB archive according to the FAIR principles: Findability, Accessibility, Interoperability and Reusability. The wwPDB recently developed OneDep, a unified tool for deposition, validation and biocuration of structures of biological macromolecules. All data deposited to the PDB undergo critical review by wwPDB Biocurators. This article outlines the importance of biocuration for structural biology data deposited to the PDB and describes wwPDB biocuration processes and the role of expert Biocurators in sustaining a high-quality archive. Structural data submitted to the PDB are examined for self-consistency, standardized using controlled vocabularies, cross-referenced with other biological data resources and validated for scientific/technical accuracy. We illustrate how biocuration is integral to PDB data archiving, as it facilitates accurate, consistent and comprehensive representation of biological structure data, allowing efficient and effective usage by research scientists, educators, students and the curious public worldwide. Database URL: https://www.wwpdb.org/
- Published
- 2018
98. The chemical component dictionary: complete descriptions of constituent molecules in experimentally determined 3D macromolecules in the Protein Data Bank
- Author
-
Chenghua Shao, Sameer Velankar, Marina Zhuravleva, Zukang Feng, Jasmine Young, and John D. Westbrook
- Subjects
Statistics and Probability ,Macromolecular Substances ,Computer science ,Chemical nomenclature ,Protein Data Bank (RCSB PDB) ,Ligands ,computer.software_genre ,Biochemistry ,Dictionaries, Chemical as Topic ,User-Computer Interface ,Molecule ,Nucleotide ,Databases, Protein ,Molecular Biology ,chemistry.chemical_classification ,Internet ,Database ,urogenital system ,Ligand ,Molecular Sequence Annotation ,computer.file_format ,equipment and supplies ,Protein Data Bank ,Original Papers ,Small molecule ,Computer Science Applications ,Amino acid ,Computational Mathematics ,Crystallography ,Computational Theory and Mathematics ,chemistry ,computer ,Databases, Chemical ,Macromolecule - Abstract
Summary: The Chemical Component Dictionary (CCD) is a chemical reference data resource that describes all residue and small molecule components found in Protein Data Bank (PDB) entries. The CCD contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, chemical descriptors, systematic chemical names and idealized coordinates. The content, preparation, validation and distribution of this CCD chemical reference dataset are described. Availability and implementation: The CCD is updated regularly in conjunction with the scheduled weekly release of new PDB structure data. The CCD and amino acid variant reference datasets are hosted in the public PDB ftp repository at ftp://ftp.wwpdb.org/pub/pdb/data/monomers/components.cif.gz, ftp://ftp.wwpdb.org/pub/pdb/data/monomers/aa-variants-v1.cif.gz, and its mirror sites, and can be accessed from http://wwpdb.org. Contact: jwest@rcsb.rutgers.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2014
99. PDB-Dev: a Prototype System for Depositing Integrative/Hybrid Structural Models
- Author
-
John L. Markley, Andrej Sali, Helen M. Berman, Torsten Schwede, Stephen K. Burley, Jill Trewhella, Sameer Velankar, Haruki Nakamura, and Genji Kurisu
- Subjects
0301 basic medicine ,Models, Molecular ,Internet ,Computer science ,business.industry ,Task force ,Protein Conformation ,Protein Data Bank (RCSB PDB) ,Proteins ,Article ,03 medical and health sciences ,User-Computer Interface ,030104 developmental biology ,0302 clinical medicine ,Structural Biology ,030212 general & internal medicine ,Software engineering ,business ,Databases, Protein ,Molecular Biology - Abstract
Burley et al. (leadership of the Worldwide PDB [wwPDB] Partnership [wwpdb.org] and the wwPDB Integrative/Hybrid Methods Task Force) announce public release of a prototype system for depositing integrative/hybrid structural models, PDB-Development (PDB-Dev; https://pdb-dev.wwpdb.org).
- Published
- 2017
100. Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive
- Author
-
Gerard J. Kleywegt, Sameer Velankar, John L. Markley, Helen M. Berman, Haruki Nakamura, and Stephen K. Burley
- Subjects
0301 basic medicine ,Structure (mathematical logic) ,Models, Molecular ,Information retrieval ,Computer science ,Macromolecular Substances ,Protein Conformation ,International Cooperation ,Protein Data Bank (RCSB PDB) ,Proteins ,Stereoisomerism ,computer.file_format ,Mass spectrometry ,Protein Data Bank ,Crystallography, X-Ray ,Data science ,Article ,03 medical and health sciences ,Microscopy, Electron ,030104 developmental biology ,Humans ,Databases, Protein ,computer ,Nuclear Magnetic Resonance, Biomolecular ,Macromolecule - Abstract
The Protein Data Bank (PDB), the single global repository of experimentally determined 3D structures of biological macromolecules and their complexes, was established in 1971, becoming the first open-access digital resource in the biological sciences. The PDB archive currently houses ~130,000 entries (May 2017). It is managed by the Worldwide Protein Data Bank organization (wwPDB; wwpdb.org), which includes the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The four wwPDB partners operate a unified global software system that enforces community-agreed data standards and supports data Deposition, Biocuration, and Validation of ~11,000 new PDB entries annually (deposit.wwpdb.org). The RCSB PDB currently acts as the archive keeper, ensuring disaster recovery of PDB data and coordinating weekly updates. wwPDB partners disseminate the same archival data from multiple FTP sites, while operating complementary websites that provide their own views of PDB data with selected value-added information and links to related data resources. At present, the PDB archives experimental data, associated metadata, and 3D-atomic level structural models derived from three well-established methods: crystallography, nuclear magnetic resonance spectroscopy (NMR), and electron microscopy (3DEM). wwPDB partners are working closely with experts in related experimental areas (small-angle scattering, chemical cross-linking/mass spectrometry, Förster resonance energy transfer or FRET, etc.) to establish a federation of data resources that will support sustainable archiving and validation of 3D structural models and experimental data derived from integrative or hybrid methods.
- Published
- 2017