71 results on '"Genki Terashi"'
Search Results
2. Analyzing effect of quadruple multiple sequence alignments on deep learning based protein inter-residue distance prediction
- Author
-
Aashish Jain, Genki Terashi, Yuki Kagaya, Sai Raghavendra Maddhuri Venkata Subramaniya, Charles Christoffer, and Daisuke Kihara
- Subjects
Medicine, Science
- Abstract
Protein 3D structure prediction has advanced significantly in recent years due to improving contact prediction accuracy. This improvement has been largely due to deep learning approaches that predict inter-residue contacts and, more recently, distances using multiple sequence alignments (MSAs). In this work we present AttentiveDist, a novel approach that uses different MSAs generated with different E-values in a single model to increase the co-evolutionary information provided to the model. To determine the importance of each MSA’s feature at the inter-residue level, we added an attention layer to the deep neural network. We show that combining four MSAs of different E-value cutoffs improved the model's prediction performance compared to single E-value MSA features. A further improvement was observed when an attention layer was used, and even more when additional prediction tasks of bond angle predictions were added. The improvement in distance prediction was successfully transferred to achieve better protein tertiary structure modeling.
- Published
- 2021
3. Detecting protein and DNA/RNA structures in cryo-EM maps of intermediate resolution using deep learning
- Author
-
Xiao Wang, Eman Alnabati, Tunde W. Aderinwale, Sai Raghavendra Maddhuri Venkata Subramaniya, Genki Terashi, and Daisuke Kihara
- Subjects
Science
- Abstract
It is challenging to extract structural information from EM density maps at intermediate or low resolutions. Here, the authors present Emap2sec+, a program for detecting nucleotides and protein secondary structures in EM density maps at 5 to 10 Å resolution.
- Published
- 2021
4. VESPER: global and local cryo-EM map alignment using local density vectors
- Author
-
Xusi Han, Genki Terashi, Charles Christoffer, Siyang Chen, and Daisuke Kihara
- Subjects
Science
- Abstract
Here, the authors present VESPER, a program for EM density map search and alignment. Using benchmark datasets, they demonstrate that VESPER performs accurate global and local alignments and comparisons of EM maps.
- Published
- 2021
5. MarkovFit: Structure Fitting for Protein Complexes in Electron Microscopy Maps Using Markov Random Field
- Author
-
Eman Alnabati, Juan Esquivel-Rodriguez, Genki Terashi, and Daisuke Kihara
- Subjects
protein modeling, cryo-EM, Markov random field, structure fitting, protein structure prediction, Biology (General), QH301-705.5
- Abstract
An increasing number of protein complex structures are determined by cryo-electron microscopy (cryo-EM). When individual protein structures have been determined and are available, an important task in structure modeling is to fit the individual structures into the density map. Here, we designed a method that fits the atomic structures of proteins in cryo-EM maps of medium to low resolution using Markov random fields, which allows probabilistic evaluation of fitted models. Our method, MarkovFit, was more accurate than existing methods on datasets of 31 simulated cryo-EM maps of resolution 10 Å, nine experimentally determined cryo-EM maps of resolution less than 4 Å, and 28 experimentally determined cryo-EM maps of resolution 6 to 20 Å.
- Published
- 2022
6. De novo main-chain modeling for EM maps using MAINMAST
- Author
-
Genki Terashi and Daisuke Kihara
- Subjects
Science
- Abstract
Main-chain tracing remains a time-consuming task for medium resolution cryo-EM maps. Here the authors describe MAINMAST, a computational approach that builds main-chain structure models of proteins from EM maps of 4-5 Å resolution by tracing local dense points in the density distribution.
- Published
- 2018
7. Modeling the assembly order of multimeric heteroprotein complexes.
- Author
-
Lenna X Peterson, Yoichiro Togawa, Juan Esquivel-Rodriguez, Genki Terashi, Charles Christoffer, Amitava Roy, Woong-Hee Shin, and Daisuke Kihara
- Subjects
Biology (General), QH301-705.5
- Abstract
Protein-protein interactions are the cornerstone of numerous biological processes. Although an increasing number of protein complex structures have been determined using experimental methods, relatively few studies have been performed to determine the assembly order of complexes. In addition to the insights into the molecular mechanisms of biological function provided by the structure of a complex, knowing the assembly order is important for understanding the process of complex formation. Assembly order is also practically useful for constructing subcomplexes as a step toward solving the entire complex experimentally, designing artificial protein complexes, and developing drugs that interrupt a critical step in the complex assembly. There are several experimental methods for determining the assembly order of complexes; however, these techniques are resource-intensive. Here, we present a computational method that predicts the assembly order of protein complexes by building the complex structure. The method, named Path-LZerD, uses a multimeric protein docking algorithm that assembles a protein complex structure from individual subunit structures and predicts assembly order by observing the simulated assembly process of the complex. Benchmarked on a dataset of complexes with experimental evidence of assembly order, Path-LZerD was successful in predicting the assembly pathway for the majority of the cases. Moreover, when compared with a simple approach that infers the assembly path from the buried surface area of subunits in the native complex, Path-LZerD has the strong advantage that it can be used for cases where the complex structure is not known. The path prediction accuracy decreased when starting from unbound monomers, particularly for larger complexes of five or more subunits, for which only a part of the assembly path was correctly identified.
As the first method of its kind, Path-LZerD opens a new area of computational protein structure modeling and will be an indispensable approach for studying protein complexes.
- Published
- 2018
8. Modeling disordered protein interactions from biophysical principles.
- Author
-
Lenna X Peterson, Amitava Roy, Charles Christoffer, Genki Terashi, and Daisuke Kihara
- Subjects
Biology (General), QH301-705.5
- Abstract
Disordered protein-protein interactions (PPIs), those involving a folded protein and an intrinsically disordered protein (IDP), are prevalent in the cell, including important signaling and regulatory pathways. IDPs do not adopt a single dominant structure in isolation but often become ordered upon binding. To aid understanding of the molecular mechanisms of disordered PPIs, it is crucial to obtain the tertiary structure of the PPIs. However, experimental methods have difficulty in solving disordered PPIs and existing protein-protein and protein-peptide docking methods are not able to model them. Here we present a novel computational method, IDP-LZerD, which models the conformation of a disordered PPI by considering the biophysical binding mechanism of an IDP to a structured protein, whereby a local segment of the IDP initiates the interaction and subsequently the remaining IDP regions explore and coalesce around the initial binding site. On a dataset of 22 disordered PPIs with IDPs up to 69 amino acids, successful predictions were made for 21 bound and 18 unbound receptors. The successful modeling provides additional support for biophysical principles. Moreover, the new technique significantly expands the capability of protein structure modeling and provides crucial insights into the molecular mechanisms of disordered PPIs.
- Published
- 2017
9. CAB-Align: A Flexible Protein Structure Alignment Method Based on the Residue-Residue Contact Area.
- Author
-
Genki Terashi and Mayuko Takeda-Shitaka
- Subjects
Medicine, Science
- Abstract
Proteins are flexible, and this flexibility has an essential functional role. Flexibility can be observed in loop regions, rearrangements between secondary structure elements, and conformational changes between entire domains. However, most protein structure alignment methods treat protein structures as rigid bodies. Thus, these methods fail to identify the equivalences of residue pairs in regions with flexibility. In this study, we considered that the evolutionary relationship between proteins corresponds directly to the residue-residue physical contacts rather than the three-dimensional (3D) coordinates of proteins. Thus, we developed a new protein structure alignment method, contact area-based alignment (CAB-align), which uses the residue-residue contact area to identify regions of similarity. The main purpose of CAB-align is to identify homologous relationships at the residue level between related protein structures. The CAB-align procedure comprises two main steps: First, a rigid-body alignment method based on local and global 3D structure superposition is employed to generate a sufficient number of initial alignments. Then, iterative dynamic programming is executed to find the optimal alignment. We evaluated the performance and advantages of CAB-align based on four main points: (1) agreement with the gold standard alignment, (2) alignment quality based on an evolutionary relationship without 3D coordinate superposition, (3) consistency of the multiple alignments, and (4) classification agreement with the gold standard classification. 
Comparisons of CAB-align with other state-of-the-art protein structure alignment methods (TM-align, FATCAT, and DaliLite) using our benchmark dataset showed that CAB-align performed robustly in obtaining high-quality alignments and generating consistent multiple alignments with high coverage and accuracy rates, and it performed extremely well when discriminating between homologous and nonhomologous pairs of proteins in both single and multi-domain comparisons. The CAB-align software is freely available to academic users as stand-alone software at http://www.pharm.kitasato-u.ac.jp/bmd/bmd/Publications.html.
- Published
- 2015
10. Bioinformatic Approaches for Characterizing Molecular Structure and Function of Food Proteins
- Author
-
Harrison Helmick, Anika Jain, Genki Terashi, Andrea Liceaga, Arun K. Bhunia, Daisuke Kihara, and Jozef L. Kokini
- Subjects
Food Science
- Abstract
Structural bioinformatics analyzes protein structural models with the goal of uncovering molecular drivers of food functionality. This field aims to develop tools that can rapidly extract relevant information from protein databases as well as organize this information for researchers interested in studying protein functionality. Food bioinformaticians take advantage of millions of protein amino acid sequences and structures contained within these databases, extracting features such as surface hydrophobicity that are then used to model functionality, including solubility, thermostability, and emulsification. This work is aided by a protein structure–function relationship framework, in which bioinformatic properties are linked to physicochemical experimentation. Strong bioinformatic correlations exist for protein secondary structure, electrostatic potential, and surface hydrophobicity. Modeling changes in protein structures through molecular mechanics is an increasingly accessible field that will continue to propel food science research.
- Published
- 2023
11. DAQ-refine: Protein structure model evaluation and refinement for cryo-EM maps
- Author
-
Genki Terashi, Xiao Wang, and Daisuke Kihara
- Subjects
Biophysics
- Published
- 2023
12. Protein model refinement for cryo-EM maps using AlphaFold2 and the DAQ score
- Author
-
Genki Terashi, Xiao Wang, and Daisuke Kihara
- Subjects
Models, Molecular, Structural Biology, Protein Conformation, Cryoelectron Microscopy, Proteins
- Abstract
As more protein structure models have been determined from cryogenic electron microscopy (cryo-EM) density maps, establishing how to evaluate the model accuracy and how to correct models in cases where they contain errors is becoming crucial to ensure the quality of the structural models deposited in the public database, the PDB. Here, a new protocol is presented for evaluating a protein model built from a cryo-EM map and applying local structure refinement in the case where the model has potential errors. Firstly, model evaluation is performed using a deep-learning-based model–local map assessment score, DAQ, that has recently been developed. The subsequent local refinement is performed by a modified AlphaFold2 procedure, in which a trimmed template model and a trimmed multiple sequence alignment are provided as input to control which structure regions to refine while leaving other more confident regions of the model intact. A benchmark study showed that this protocol, DAQ-refine, consistently improves low-quality regions of the initial models. Among 18 refined models generated for an initial structure, DAQ shows a high correlation with model quality and can identify the most accurate model for most of the tested cases. The improvements obtained by DAQ-refine were on average larger than those of other existing methods.
- Published
- 2022
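The first step described in the abstract above, using a per-residue quality score to decide which regions to refine and which to leave intact, might be sketched roughly as follows. This is only an illustration: the function name `low_quality_regions`, the window size, the cutoff, and the assumption that lower scores indicate a poorer local fit are all hypothetical and are not taken from the DAQ-refine protocol itself.

```python
def low_quality_regions(scores, window=3, cutoff=0.0, min_len=2):
    """Return inclusive, 0-based (start, end) residue ranges whose
    window-averaged score falls below `cutoff`.

    `scores` is a list of per-residue quality scores (hypothetical input;
    lower is assumed to mean a poorer fit to the map).
    """
    n = len(scores)
    half = window // 2
    flagged = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        avg = sum(scores[lo:hi]) / (hi - lo)  # average over the local window
        flagged.append(avg < cutoff)

    regions, start = [], None
    # A trailing False closes any region that runs to the last residue.
    for i, bad in enumerate(flagged + [False]):
        if bad and start is None:
            start = i
        elif not bad and start is not None:
            if i - start >= min_len:
                regions.append((start, i - 1))
            start = None
    return regions


example_scores = [0.5, 0.4, -0.6, -0.8, -0.7, 0.3, 0.6, 0.5]
regions = low_quality_regions(example_scores)
```

Residues outside the returned ranges would be the ones kept as-is, while the flagged ranges are candidates for re-modeling.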
13. Efficient Flexible Fitting Refinement with Automatic Error Fixing for De Novo Structure Modeling from Cryo-EM Density Maps
- Author
-
Daisuke Kihara, Takaharu Mori, Genki Terashi, Yuji Sugita, and Daisuke Matsuoka
- Subjects
010304 chemical physics, Protein Conformation, Computer science, Cryo-electron microscopy, General Chemical Engineering, Cryoelectron Microscopy, Structure (category theory), Proteins, General Chemistry, Molecular Dynamics Simulation, Library and Information Sciences, Overfitting, 01 natural sciences, 0104 chemical sciences, Computer Science Applications, Progressive refinement, 010404 medicinal & biomolecular chemistry, Molecular dynamics, Structural biology, 0103 physical sciences, Simulated annealing, Protein structure modeling, Algorithm
- Abstract
Structural modeling of proteins from cryo-electron microscopy (cryo-EM) density maps is one of the challenging issues in structural biology. De novo modeling combined with flexible fitting refinement (FFR) has been widely used to build a structure of new proteins. In de novo prediction, artificial conformations containing local structural errors such as chirality errors, cis peptide bonds, and ring penetrations are frequently generated and cannot be easily removed in the subsequent FFR. Moreover, refinement can be significantly suppressed due to the low mobility of atoms inside the protein. To overcome these problems, we propose an efficient scheme for FFR, in which the local structural errors are fixed first, followed by FFR using an iterative simulated annealing (SA) molecular dynamics protocol with the united atom (UA) model in an implicit solvent model; we call this scheme "SAUA-FFR". The best model is selected from multiple flexible fitting runs with various biasing force constants to reduce overfitting. We apply our scheme to the decoys obtained from MAINMAST and demonstrate an improvement of the best model of eight selected proteins in terms of the root-mean-square deviation, MolProbity score, and RWplus score compared to the original scheme of MAINMAST. Fixing the local structural errors can enhance the formation of secondary structures, and the UA model enables progressive refinement compared to the all-atom model owing to its high mobility in the implicit solvent. The SAUA-FFR scheme realizes efficient and accurate protein structure modeling from medium-resolution maps with less overfitting.
- Published
- 2021
14. Detecting protein and DNA/RNA structures in cryo-EM maps of intermediate resolution using deep learning
- Author
-
Tunde Aderinwale, Xiao Wang, Eman Alnabati, Daisuke Kihara, Sai Raghavendra Maddhuri Venkata Subramaniya, and Genki Terashi
- Subjects
0301 basic medicine, Models, Molecular, Cryo-electron microscopy, Science, Biophysics, General Physics and Astronomy, computer.software_genre, 01 natural sciences, Convolutional neural network, General Biochemistry, Genetics and Molecular Biology, Article, Protein Structure, Secondary, 03 medical and health sciences, Computational biophysics, Protein structure, Deep Learning, Voxel, Cryoelectron microscopy, Protein secondary structure, Physics, Multidisciplinary, 010405 organic chemistry, business.industry, Deep learning, Resolution (electron density), RNA, Computational Biology, General Chemistry, DNA, 0104 chemical sciences, 030104 developmental biology, Nucleic Acid Conformation, Artificial intelligence, Biological system, business, computer, Software, Macromolecule
- Abstract
An increasing number of density maps of macromolecular structures, including proteins and DNA/RNA complexes, have been determined by cryo-electron microscopy (cryo-EM). Although maps at near-atomic resolution are now routinely reported, a substantial fraction of maps are still determined at intermediate or low resolutions, where extracting structure information is not trivial. Here, we report a new computational method, Emap2sec+, which identifies DNA or RNA as well as the secondary structures of proteins in cryo-EM maps of 5 to 10 Å resolution. Emap2sec+ employs a deep residual convolutional neural network. Emap2sec+ assigns structural labels with associated probabilities at each voxel in a cryo-EM map, which will help structure modeling in an EM map. Emap2sec+ showed stable and high assignment accuracy for nucleotides in low resolution maps and better performance for protein secondary structure assignments than its earlier version when tested on simulated and experimental maps. It is challenging to extract structural information from EM density maps at intermediate or low resolutions. Here, the authors present Emap2sec+, a program for detecting nucleotides and protein secondary structures in EM density maps at 5 to 10 Å resolution.
- Published
- 2021
15. Protein Structural Modeling for Electron Microscopy Maps Using VESPER and MAINMAST
- Author
-
Eman Alnabati, Genki Terashi, and Daisuke Kihara
- Subjects
Models, Molecular, Models, Structural, Medical Laboratory Technology, Microscopy, Electron, General Immunology and Microbiology, General Neuroscience, Cryoelectron Microscopy, Proteins, Health Informatics, General Pharmacology, Toxicology and Pharmaceutics, General Biochemistry, Genetics and Molecular Biology
- Abstract
An increasing number of protein structures are determined by cryo-electron microscopy (cryo-EM) and stored in the Electron Microscopy Data Bank (EMDB). To interpret determined cryo-EM maps, several methods have been developed that model the tertiary structure of biomolecules, particularly proteins. Here we show how to use two such methods, VESPER and MAINMAST, which were developed in our group. VESPER serves two main purposes: fitting protein structure models into an EM map and aligning two EM maps locally or globally to capture their similarity. VESPER represents each EM map as a set of vectors pointing toward denser points. By matching the directions of these vectors, VESPER generally aligns maps better than conventional methods that consider only the local densities of maps. MAINMAST is a de novo protein modeling tool designed for EM maps with resolution of 3-5 Å or better. MAINMAST builds a protein main chain directly from a density map by tracing dense points in an EM map and connecting them using a tree-graph structure. This article describes how to use these two tools using three illustrative modeling examples. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Protein structure model fitting using VESPER. Alternate Protocol: Atomic model fitting using VESPER web server. Basic Protocol 2: Protein de novo modeling using MAINMAST.
- Published
- 2022
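The idea in the abstract above, representing a map as vectors that point toward denser points and scoring an alignment by how well those directions agree, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the names `density_vectors` and `direction_score` are hypothetical, the map is a sparse dictionary rather than a real EM volume, and a finite-difference gradient stands in for whatever procedure the published tool uses to compute its vectors.

```python
import math


def unit(v):
    """Scale a 3-vector to unit length; a zero vector stays zero."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v) if norm > 0 else (0.0, 0.0, 0.0)


def density_vectors(density):
    """Map each grid point to a unit vector pointing toward higher density.

    `density` is a dict {(x, y, z): value}; missing neighbours count as 0.
    The direction is a central-difference gradient (an illustrative stand-in).
    """
    vectors = {}
    for (x, y, z) in density:
        grad = (
            (density.get((x + 1, y, z), 0.0) - density.get((x - 1, y, z), 0.0)) / 2,
            (density.get((x, y + 1, z), 0.0) - density.get((x, y - 1, z), 0.0)) / 2,
            (density.get((x, y, z + 1), 0.0) - density.get((x, y, z - 1), 0.0)) / 2,
        )
        vectors[(x, y, z)] = unit(grad)
    return vectors


def direction_score(vecs_a, vecs_b):
    """Sum of dot products over the grid points present in both maps."""
    shared = vecs_a.keys() & vecs_b.keys()
    return sum(
        sum(a * b for a, b in zip(vecs_a[p], vecs_b[p])) for p in shared
    )


# Two toy one-dimensional "maps": one peaked in the middle, one at the edge.
centered = {(x, 0, 0): d for x, d in enumerate([1, 2, 4, 2, 1])}
shifted = {(x, 0, 0): d for x, d in enumerate([4, 2, 1, 0.5, 0.25])}

vec_centered = density_vectors(centered)
vec_shifted = density_vectors(shifted)
self_score = direction_score(vec_centered, vec_centered)
cross_score = direction_score(vec_centered, vec_shifted)
```

In this toy case the map scores higher against itself than against the shifted map, because the direction vectors disagree wherever the density peaks are in different places. A real search would also have to explore rotations and translations, which this sketch omits.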
16. Genotype & phenotype in Lowe Syndrome: specific OCRL1 patient mutations differentially impact cellular phenotypes
- Author
-
R. Claudio Aguilar, Genki Terashi, Swetha Ramadesikan, Agustina De La Fuente, Daisuke Kihara, Lisette Skiba, Claudia B. Hanna, Tony R. Hazbun, Kayalvizhi Madhivanan, Jennifer Lee, and Daipayan Sarkar
- Subjects
Models, Molecular, Protein Conformation, Oculocerebrorenal syndrome, Phosphatase, Disease, Biology, medicine.disease_cause, Cell Line, 03 medical and health sciences, Genotype, Genetics, medicine, Humans, Computer Simulation, Molecular Biology, Gene, Genetics (clinical), 030304 developmental biology, 0303 health sciences, Mutation, 030305 genetics & heredity, Genetic disorder, General Medicine, medicine.disease, Protein subcellular localization prediction, Phenotype, Phosphoric Monoester Hydrolases, Protein Transport, HEK293 Cells, Oculocerebrorenal Syndrome, General Article
- Abstract
Lowe Syndrome (LS) is a lethal genetic disorder caused by mutations in the OCRL1 gene, which encodes the lipid 5’ phosphatase Ocrl1. Patients exhibit a characteristic triad of symptoms including eye, brain, and kidney abnormalities, with renal failure as the most common cause of premature death. Over 200 OCRL1 mutations have been identified in LS, but their specific impact on cellular processes is unknown. Despite observations of heterogeneity in patient symptom severity, there is little understanding of the correlation between genotype and its impact on phenotype. Here, we show that different mutations had diverse effects on protein localization and on triggering LS cellular phenotypes. In addition, some mutations affecting specific domains imparted unique characteristics to the resulting mutated protein. We also propose that certain mutations conformationally affect the 5’-phosphatase domain of the protein, resulting in loss of enzymatic activity and causing common and specific phenotypes. This study is the first to show the differential effect of patient 5’-phosphatase mutations on cellular phenotypes and introduces a conformational disease component in LS. This work provides a framework that can help stratify patients as well as produce a more accurate prognosis depending on the nature and location of the mutation within the OCRL1 gene.
- Published
- 2021
17. Surface-based protein domains retrieval methods from a SHREC2021 challenge
- Author
-
Florent Langenfeld, Tunde Aderinwale, Charles Christoffer, Woong-Hee Shin, Genki Terashi, Xiao Wang, Daisuke Kihara, Halim Benhabiles, Karim Hammoudi, Adnane Cabani, Feryal Windal, Mahmoud Melkemi, Ekpo Otu, Reyer Zwiggelaar, David Hunter, Yonghuai Liu, Léa Sirugue, Huu-Nghia H. Nguyen, Tuan-Duy H. Nguyen, Vinh-Thuyen Nguyen-Truong, Danh Le, Hai-Dang Nguyen, Minh-Triet Tran, and Matthieu Montès
Laboratoire Génomique, bioinformatique et chimie moléculaire (GBCM), Conservatoire National des Arts et Métiers [CNAM] (CNAM), HESAM Université - Communauté d'universités et d'établissements Hautes écoles Sorbonne Arts et métiers université (HESAM)-HESAM Université - Communauté d'universités et d'établissements Hautes écoles Sorbonne Arts et métiers université (HESAM), Department of Computer Science [Purdue], Purdue University [West Lafayette], Suncheon National University [Suncheon, Corée du Sud], Institut d’Électronique, de Microélectronique et de Nanotechnologie - UMR 8520 (IEMN), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Université Polytechnique Hauts-de-France (UPHF)-JUNIA (JUNIA), Université catholique de Lille (UCL)-Université catholique de Lille (UCL), Bio-Micro-Electro-Mechanical Systems - IEMN (BIOMEMS - IEMN), Université catholique de Lille (UCL)-Université catholique de Lille (UCL)-Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Université Polytechnique Hauts-de-France (UPHF)-JUNIA (JUNIA), JUNIA (JUNIA), Université catholique de Lille (UCL), Institut de Recherche en Informatique Mathématiques Automatique Signal - IRIMAS - UR 7499 (IRIMAS), Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA)), Université de Strasbourg (UNISTRA), École Supérieure d’Ingénieurs en Génie Électrique (ESIGELEC), Aberystwyth University, Edge Hill University, and Vietnam National University - Ho Chi Minh City (VNU-HCM)
Léa Sirugue, Matthieu Montès, and Florent Langenfeld are supported by the European Research Council Executive Agency under the research grant number 640,283. Daisuke Kihara acknowledges support from the National Institutes of Health (R01GM133840, R01GM123055) and the National Science Foundation (DBI2003635, CMMI1825941, and MCB1925643). Charles Christoffer is supported by an NIGMS-funded predoctoral fellowship (T32 GM132024). Huu-Nghia H. Nguyen, Tuan-Duy H. Nguyen, Vinh-Thuyen Nguyen-Truong, Danh Le, Hai-Dang Nguyen, and Minh-Triet Tran are supported by National University Ho Chi Minh City (VNU-HCM) (DS2020-42-01).
- Subjects
Models, Molecular, Static Electricity, Proteins, Ligands, Computer Graphics and Computer-Aided Design, Article, Proteins surface, [SPI]Engineering Sciences [physics], SHREC2021, Protein Domains, Materials Chemistry, Physical and Theoretical Chemistry, Spectroscopy, 2000 MSC: 92-08
- Abstract
Journal publication following the communication hal-03467479 (SHREC 2021: surface-based protein domains retrieval); International audience; Proteins are essential to nearly all cellular mechanisms and are the effectors of the cell's activities. As such, they often interact through their surface with other proteins or other cellular ligands such as ions or organic molecules. Evolution generates many different proteins with unique abilities, but also proteins with related functions and hence similar 3D surface properties (shape, physico-chemical properties, …). Protein surfaces are therefore of primary importance for their activity. In the present work, we assess the ability of different methods to detect such similarities based on the geometry of the protein surfaces (described as 3D meshes), using either their shape only, or their shape and the electrostatic potential (a biologically relevant property of protein surfaces). Five different groups participated in this contest using the shape-only dataset, and one group extended its pre-existing method to handle the electrostatic potential. Our comparative study reveals both the ability of the methods to detect related proteins and their difficulty in distinguishing between highly related proteins. Our study also allows us to analyze the putative influence of electrostatic information in addition to that of protein shape alone. Finally, the discussion compares the results for the extended method with those obtained in the previous contests. The source codes of each presented method have been made available online.
- Published
- 2022
18. DAQ-score database: Deep-learning based quality estimation of cryo-EM derived protein models
- Author
-
Tsukasa Nakamura, Xiao Wang, Genki Terashi, and Daisuke Kihara
- Subjects
Biophysics
- Published
- 2023
19. Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment
- Author
-
Xiaoqin Zou, Théo Mauri, Hang Shi, Shaowen Zhu, Justas Dapkūnas, Yuanfei Sun, Didier Barradas-Bautista, Raphael A. G. Chaleil, Ragul Gowthaman, Sohee Kwon, Xianjin Xu, Zuzana Jandova, Genki Terashi, Ryota Ashizawa, Petras J. Kundrotas, Shuang Zhang, Tunde Aderinwale, Jian Liu, Sandor Vajda, Paul A. Bates, Jianlin Cheng, Daisuke Kihara, Luis A. Rodríguez-Lumbreras, Carlos A. Del Carpio Muñoz, Liming Qiu, Guillaume Brysbaert, Jorge Roel-Touris, Česlovas Venclovas, Tereza Clarence, Rui Yin, Amar Singh, Patryk A. Wesołowski, Rafał Ślusarz, Adam Liwo, Guangbo Yang, Agnieszka S. Karczyńska, Yoshiki Harada, Sergei Kotelnikov, Yuya Hanazono, Charlotte W. van Noort, Marc F. Lensink, Jonghun Won, Adam K. Sieradzan, Israel Desta, Xufeng Lu, Charles Christoffer, Anna Antoniak, Taeyong Park, Sheng-You Huang, Tsukasa Nakamura, Brian G. Pierce, Usman Ghani, Yang Shen, Luigi Cavallo, Chaok Seok, Hao Li, Nurul Nadzirin, Ghazaleh Taherzadeh, Jacob Verburgt, Rodrigo V. Honorato, Artur Giełdoń, Jeffrey J. Gray, Dima Kozakov, Ming Liu, Shan Chang, Eiichiro Ichiishi, Manon Réau, Rui Duan, Francesco Ambrosetti, Johnathan D. Guest, Juan Fernández-Recio, Alexandre M. J. J. Bonvin, Ilya A. Vakser, Farhan Quadir, Yumeng Yan, Ren Kong, Sameer Velankar, Sergei Grudinin, Mateusz Kogut, Mikhail Ignatov, Yasuomi Kiyota, Hyeonuk Woo, Shoshana J. Wodak, Ameya Harmalkar, Shinpei Kobayashi, Panagiotis I. Koukos, Zhen Cao, Kliment Olechnovič, Cezary Czaplewski, Xiao Wang, Agnieszka G. Lipska, Kathryn A. Porter, Peicong Lin, Emilia A. Lubecka, Nasser Hashemi, Bin Liu, Mayuko Takeda-Shitaka, Karolina Zięba, Dzmitry Padhorny, Zhuyezi Sun, Daipayan Sarkar, Romina Oliva, Andrey Alekseenko, Siri Camee van Keulen, Mireia Rosell, Raj S. 
Roy, Brian Jiménez-García, Jinsol Yang, Martyna Maszota-Zieleniak, Cancer Research UK, Department of Energy and Climate Change (UK), European Commission, Institut National de Recherche en Informatique et en Automatique (France), Medical Research Council (UK), Japan Society for the Promotion of Science, Ministerio de Ciencia, Innovación y Universidades (España), Agencia Estatal de Investigación (España), National Institute of General Medical Sciences (US), National Institutes of Health (US), National Natural Science Foundation of China, National Science Foundation (US), Unité de Glycobiologie Structurale et Fonctionnelle (UGSF), Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), European Bioinformatics Institute [Hinxton] (EMBL-EBI), EMBL Heidelberg, Biomolecular Modelling Laboratory [London], The Francis Crick Institute [London], Jiangsu University of Technology [Changzhou], Department of Electrical Engineering and Computer Science [Columbia] (EECS), University of Missouri [Columbia] (Mizzou), University of Missouri System-University of Missouri System, Institute for Data Science and Informatics [Columbia], University of Gdańsk (UG), Faculty of Electronics, Telecommunications and Informatics [GUT Gdańsk] (ETI), Gdańsk University of Technology (GUT), Medical University of Gdańsk, Graduate School of Medical Sciences [Nagoya], Nagoya City University [Nagoya, Japan], International University of Health and Welfare Hospital (IUHW Hospital), Department of Chemical and Biomolecular Engineering [Baltimore], Johns Hopkins University (JHU), Bijvoet Center of Biomolecular Research [Utrecht], Utrecht University [Utrecht], Stony Brook University [SUNY] (SBU), State University of New York (SUNY), Innopolis University, Boston University [Boston] (BU), Russian Academy of Sciences [Moscow] (RAS), Barcelona Supercomputing Center - Centro Nacional de Supercomputacion (BSC 
- CNS), Universidad de La Rioja (UR), Algorithms for Modeling and Simulation of Nanosystems (NANO-D), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK), Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA), Données, Apprentissage et Optimisation (DAO), Laboratoire Jean Kuntzmann (LJK), Université Grenoble Alpes (UGA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Huazhong University of Science and Technology [Wuhan] (HUST), Indiana University - Purdue University Indianapolis (IUPUI), Indiana University System, Graduate School of Information Sciences [Sendaï], Tohoku University [Sendai], National Institutes for Quantum and Radiological Science and Technology (QST), University of Maryland [Baltimore], King Abdullah University of Science and Technology (KAUST), University of Naples Federico II, Texas A&M University [Galveston], Seoul National University [Seoul] (SNU), Kitasato University, University of Kansas [Lawrence] (KU), Vilnius University [Vilnius], University of Missouri System, VIB-VUB Center for Structural Biology [Bruxelles], VIB [Belgium], Sub NMR Spectroscopy, Sub Overig UiLOTS, Sub Mathematics Education, NMR Spectroscopy, Université de Lille, CNRS, Unité de Glycobiologie Structurale et 
Fonctionnelle (UGSF) - UMR 8576, European Bioinformatics Institute [Hinxton] [EMBL-EBI], Department of Electrical Engineering and Computer Science [Columbia] [EECS], Faculty of Chemistry [Univ Gdańsk], Faculty of Electronics, Telecommunications and Informatics [GUT Gdańsk] [ETI], International University of Health and Welfare Hospital [IUHW Hospital], Johns Hopkins University [JHU], Stony Brook University [SUNY] [SBU], Department of Biomedical Engineering [Boston], Instituto de Ciencias de la Vid y el Vino [ICVV], Huazhong University of Science and Technology [Wuhan] [HUST], Indiana University - Purdue University Indianapolis [IUPUI], National Institutes for Quantum and Radiological Science and Technology [QST], King Abdullah University of Science and Technology [KAUST], Università degli Studi di Napoli 'Parthenope' = University of Naples [PARTHENOPE], Seoul National University [Seoul] [SNU], University of Kansas [Lawrence] [KU], University of Missouri [Columbia] [Mizzou], Unité de Glycobiologie Structurale et Fonctionnelle - UMR 8576 (UGSF), Université de Lille-Centre National de la Recherche Scientifique (CNRS), University of Naples Federico II = Università degli studi di Napoli Federico II, European Project: 675728,H2020,H2020-EINFRA-2015-1,BioExcel(2015), European Project: 823830,H2020-EU.1.4.1.3. Development, deployment and operation of ICT-based e-infrastructures, H2020-EU.1.4. EXCELLENT SCIENCE - Research Infrastructures ,BioExcel-2(2019), European Project: 777536,H2020-EU.1.4.1.3. Development, deployment and operation of ICT-based e-infrastructures, and H2020-EU.1.4. EXCELLENT SCIENCE - Research Infrastructures,EOSC-hub(2018)
- Subjects
Models, Molecular ,blind prediction ,CAPRI ,CASP ,docking ,oligomeric state ,protein assemblies ,protein complexes ,protein docking ,protein–protein interaction ,template-based modeling ,Computer science ,[SDV]Life Sciences [q-bio] ,Machine learning ,computer.software_genre ,Biochemistry ,Article ,protein-protein interaction ,03 medical and health sciences ,Sequence Analysis, Protein ,Structural Biology ,Server ,Protein Interaction Domains and Motifs ,Molecular Biology ,ComputingMilieux_MISCELLANEOUS ,030304 developmental biology ,0303 health sciences ,Binding Sites ,business.industry ,030302 biochemistry & molecular biology ,Computational Biology ,Proteins ,3. Good health ,Molecular Docking Simulation ,Artificial intelligence ,business ,computer ,Software - Abstract
We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups including eight automatic servers submitted ~1250 models per target. Twenty groups including six servers participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70–75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance with more groups submitting correct models for 70–80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups. 
In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands., Cancer Research UK, Grant/Award Number: FC001003; Changzhou Science and Technology Bureau, Grant/Award Number: CE20200503; Department of Energy and Climate Change, Grant/Award Numbers: DE-AR001213, DE-SC0020400, DE-SC0021303; H2020 European Institute of Innovation and Technology, Grant/Award Numbers: 675728, 777536, 823830; Institut national de recherche en informatique et en automatique (INRIA), Grant/Award Number: Cordi-S; Lietuvos Mokslo Taryba, Grant/Award Numbers: S-MIP-17-60, S-MIP-21-35; Medical Research Council, Grant/Award Number: FC001003; Japan Society for the Promotion of Science KAKENHI, Grant/Award Number: JP19J00950; Ministerio de Ciencia e Innovación, Grant/Award Number: PID2019-110167RB-I00; Narodowe Centrum Nauki, Grant/Award Numbers: UMO-2017/25/B/ST4/01026, UMO-2017/26/M/ST4/00044, UMO-2017/27/B/ST4/00926; National Institute of General Medical Sciences, Grant/Award Numbers: R21GM127952, R35GM118078, RM1135136, T32GM132024; National Institutes of Health, Grant/Award Numbers: R01GM074255, R01GM078221, R01GM093123, R01GM109980, R01GM133840, R01GN123055, R01HL142301, R35GM124952, R35GM136409; National Natural Science Foundation of China, Grant/Award Number: 81603152; National Science Foundation, Grant/Award Numbers: AF1645512, CCF1943008, CMMI1825941, DBI1759277, DBI1759934, DBI1917263, DBI20036350, IIS1763246, MCB1925643; NWO, Grant/Award Number: TOP-PUNT 718.015.001; Wellcome Trust, Grant/Award Number: FC001003
- Published
- 2021
- Full Text
- View/download PDF
20. Residue-wise local quality estimation for protein models from cryo-EM maps
- Author
-
Genki Terashi, Xiao Wang, Sai Raghavendra Maddhuri Venkata Subramaniya, John J. G. Tesmer, and Daisuke Kihara
- Subjects
Models, Molecular ,Protein Conformation ,Cryoelectron Microscopy ,Proteins ,Cell Biology ,Amino Acids ,Molecular Biology ,Biochemistry ,Protein Structure, Secondary ,Article ,Biotechnology - Abstract
An increasing number of protein structures are being determined by cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM density maps is improving in general, there are still many cases where amino acids of a protein are assigned with different levels of confidence. Here we developed a method that identifies potential misassignment of residues in the map, including residue shifts along an otherwise correct main-chain trace. The score, named DAQ, computes the likelihood that the local density corresponds to different amino acids, atoms, and secondary structures, estimated via deep learning, and assesses the consistency of the amino acid assignment in the protein structure model with that likelihood. When DAQ was applied to different versions of model structures in the Protein Data Bank that were derived from the same density maps, a clear improvement in the DAQ score was observed in the newer versions of the models. DAQ also found potential misassignment errors in a substantial number of deposited protein structure models built into cryo-EM maps.
- Published
- 2021
21. Real-Time Structure Search and Structure Classification for AlphaFold Protein Models
- Author
-
Tunde Aderinwale, Zicong Zhang, Rhashidedin Jahandideh, Genki Terashi, Daisuke Kihara, Yuki Kagaya, Charles Christoffer, and Vijay Bharadwaj
- Subjects
Models, Molecular ,Surface (mathematics) ,business.industry ,Computer science ,Zernike polynomials ,Structure (category theory) ,Medicine (miscellaneous) ,Proteins ,Pattern recognition ,Protein structure prediction ,General Biochemistry, Genetics and Molecular Biology ,symbols.namesake ,Software ,Protein structure ,symbols ,Protein model ,Neural Networks, Computer ,Artificial intelligence ,General Agricultural and Biological Sciences ,business ,Representation (mathematics) - Abstract
Last year saw a breakthrough in protein structure prediction, where the AlphaFold2 method showed a substantial improvement in the modeling accuracy. Following the software release of AlphaFold2, structures predicted by AlphaFold2 for proteins in 21 species were made publicly available via the AlphaFold Database. Here, to facilitate structural analysis and application of AlphaFold2 models, we provide the infrastructure, 3D-AF-Surfer, which allows real-time structure-based search for the AlphaFold2 models. In 3D-AF-Surfer, structures are represented with 3D Zernike descriptors (3DZD), a rotationally invariant mathematical representation of 3D shapes. We developed a neural network that takes 3DZDs of proteins as input and retrieves proteins of the same fold more accurately than direct comparison of 3DZDs. Using 3D-AF-Surfer, we report structure classifications of AlphaFold2 models and discuss the correlation between confidence levels of AlphaFold2 models and intrinsically disordered regions.
- Published
- 2021
- Full Text
- View/download PDF
22. CryoFold: determining protein structures and data-guided ensembles from cryo-EM density maps
- Author
-
Gaspard Debussche, Emad Tajkhorshid, Mrinal Shekhar, Chitrak Gupta, Jonathan Nguyen, Wade D. Van Horn, Alberto Perez, Abhishek Singharoy, Daisuke Kihara, John Vant, Nicholas J. Sisco, Arup Mondal, Genki Terashi, Ken A. Dill, Daipayan Sarkar, and Petra Fromme
- Subjects
Quantitative Biology::Biomolecules ,Ensemble forecasting ,Computer science ,Cryo-electron microscopy ,1.1 Normal biological development and functioning ,Resolution (electron density) ,Bioengineering ,Folding (DSP implementation) ,Python (programming language) ,1.4 Methodologies and measurements ,Article ,Molecular dynamics ,Protein structure ,Underpinning research ,2.1 Biological and endogenous factors ,General Materials Science ,Protein folding ,Generic health relevance ,Aetiology ,Biological system ,computer ,computer.programming_language - Abstract
Cryo-electron microscopy (EM) requires molecular modeling to refine structural details from data. Ensemble models arrive at low free-energy molecular structures, but are computationally expensive and limited to resolving only small proteins that cannot be resolved by cryo-EM. Here, we introduce CryoFold - a pipeline of molecular dynamics simulations that determines ensembles of protein structures directly from sequence by integrating density data of varying sparsity at 3-5 Å resolution with coarse-grained topological knowledge of the protein folds. We present six examples showing its broad applicability for folding proteins of 72 to 2000 residues, including large membrane and multi-domain systems, and results from two EMDB competitions. Driven by data from a single state, CryoFold discovers ensembles of common low-energy models together with rare low-probability structures that capture the equilibrium distribution of proteins constrained by the density maps. Many of these conformations, unseen by traditional methods, are experimentally validated and functionally relevant. We arrive at a set of best practices for data-guided protein folding that are controlled using a Python GUI.
- Published
- 2021
23. Geometrical Conversion of the EGFR Extracellular Domain by Adiabatic Mapping Combining Normal Mode Analysis of the Elastic Network Model and Energy Optimization
- Author
-
Hajime Matsubara, Yasuomi Kiyota, Genki Terashi, Hiroyuki Nojima, and Mayuko Takeda-Shitaka
- Subjects
Models, Molecular ,Chemistry ,Rigidity (psychology) ,General Chemistry ,General Medicine ,Crystal structure ,Crystallography, X-Ray ,Energy minimization ,Elasticity ,ErbB Receptors ,Chemical physics ,Normal mode ,Epidermal growth factor ,Drug Discovery ,Domain (ring theory) ,Extracellular ,Humans ,Protein Interaction Maps ,Adiabatic process - Abstract
The activation of epidermal growth factor receptor (EGFR) involves the geometrical conversion of the extracellular domain (ECD) from the tethered to the extended forms with the dynamic rearrangement of the relative positions of four subdomains (SDs); however, this conversion process has not yet been thoroughly understood. We compare the two different forms of the X-ray crystal structures of ECD and simulate the ECD conversion process using adiabatic mapping that combines normal mode analysis of the elastic network model (ENM-NMA) and energy optimization. A comparison of the crystal structures reveals the rigidity of the intradomain geometry of the SD-I and -III backbone regardless of the form. The forward mapping from the tethered to the extended forms retains the intradomain geometry of the SD-I and -III backbone and reveals the trends to rearrange the relative positions of SD-I and -III and to dissociate the C-terminal tail of SD-IV from the hairpin loop in SD-II. The reverse mapping from the extended to the tethered forms complements the promotion of ECD conversion in the presence of epidermal growth factor (EGF).
- Published
- 2019
- Full Text
- View/download PDF
24. Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge
- Author
-
Martyn Winn, Maxim Igaev, Bohdan Monastyrskyy, Genki Terashi, Catherine L. Lawson, Mark A. Herzik, Jianlin Cheng, Michael F. Schmid, Renzhi Cao, Kevin Cowtan, Mateusz Olek, Dilip Kumar, Jonas Pfab, Stephanie A. Wankowicz, Wah Chiu, Luisa U. Schäfer, Paul D. Adams, Grigore D. Pintilie, Daipayan Sarkar, Sumit Mittal, Daisuke Kihara, Frank DiMaio, Zhe Wang, Tianqi Wu, Andriy Kryshtafovych, Tom Burnley, Mrinal Shekhar, Paul S. Bond, Gunnar F. Schröder, Li-Wei Hung, Andrea C. Vaiana, Ardan Patwardhan, Daniel P. Farrell, Liguo Wang, Ken A. Dill, Pavel V. Afonine, Jane S. Richardson, Agnel Praveen Joseph, Xiaodi Yu, Helen M. Berman, Singharoy A, Alberto Perez, Thomas C. Terwilliger, Kaiming Zhang, Jie Hou, Soon Wen Hoh, James S. Fraser, Dong Si, Peter B. Rosenthal, Colin M. Palmer, Benjamin A Barad, Matthew L. Baker, Grzegorz Chojnowski, and Christopher J. Williams
- Subjects
Models, Molecular ,Technology ,Statistical methods ,Computer science ,Protein Conformation ,computer.software_genre ,Crystallography, X-Ray ,Biochemistry ,Medical and Health Sciences ,Model validation ,0302 clinical medicine ,Software ,Models ,media_common ,0303 health sciences ,Crystallography ,Protein databases ,Biological Sciences ,Networking and Information Technology R&D ,Biotechnology ,Validation study ,Modeling software ,media_common.quotation_subject ,Context (language use) ,Bioengineering ,Machine learning ,03 medical and health sciences ,Benchmark (surveying) ,Quality (business) ,ddc:610 ,Molecular Biology ,030304 developmental biology ,Structure (mathematical logic) ,business.industry ,Cryoelectron Microscopy ,Molecular ,Proteins ,Cell Biology ,X-Ray ,Artificial intelligence ,Generic health relevance ,business ,computer ,030217 neurology & neurosurgery ,Analysis ,Developmental Biology - Abstract
This paper describes outcomes of the 2019 Cryo-EM Model Challenge. The goals were to (1) assess the quality of models that can be produced from cryogenic electron microscopy (cryo-EM) maps using current modeling software, (2) evaluate reproducibility of modeling results from different software developers and users and (3) compare performance of current metrics used for model evaluation, particularly Fit-to-Map metrics, with focus on near-atomic resolution. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived by 13 participating teams from four benchmark maps, including three forming a resolution series (1.8 to 3.1 Å). The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual experiments and structure data archives such as the Protein Data Bank. We recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed cryo-EM map density. A multi-laboratory study in the form of a community challenge assesses the quality of models that can be produced from cryo-EM maps using different software tools, the reproducibility of models generated by different users and the performance of metrics used for model validation.
- Published
- 2021
- Full Text
- View/download PDF
25. Super-Resolution Cryo-EM Maps With 3D Deep Generative Networks
- Author
-
Daisuke Kihara, Genki Terashi, and Sai Raghavendra Maddhuri Venkata Subramaniya
- Subjects
Range (mathematics) ,business.industry ,Cryo-electron microscopy ,Computer science ,Deep learning ,Resolution (electron density) ,Artificial intelligence ,business ,Superresolution ,Algorithm ,Generative grammar ,Macromolecule - Abstract
An increasing number of biological macromolecules have been solved with cryo-electron microscopy (cryo-EM). Over the past few years, the resolutions of density maps determined by cryo-EM have largely improved in general. However, there are still many cases where the resolution is not high enough to model molecular structures with standard computational tools. If the resolution obtained is near the empirical border line (3-4 Å), a small improvement of resolution will significantly facilitate structure modeling. Here, we report SuperEM, a novel deep learning-based method that uses a three-dimensional generative adversarial network for generating an improved-resolution EM map from an experimental EM map. SuperEM is designed to work with EM maps in the resolution range of 3 Å to 6 Å and has shown an average resolution improvement of 1.0 Å on a test dataset of 36 experimental maps. The generated super-resolution maps are shown to result in better structure modelling of proteins.
- Published
- 2021
- Full Text
- View/download PDF
26. AttentiveDist: Protein Inter-Residue Distance Prediction Using Deep Learning with Attention on Quadruple Multiple Sequence Alignments
- Author
-
Genki Terashi, Maddhuri Venkata Subramaniya, Yuki Kagaya, Charles Christoffer, Aashish Jain, and Daisuke Kihara
- Subjects
Artificial neural network ,Computer science ,business.industry ,Deep learning ,Pattern recognition ,Artificial intelligence ,business - Abstract
Protein 3D structure prediction has advanced significantly in recent years due to improving contact prediction accuracy. This improvement has been largely due to deep learning approaches that predict inter-residue contacts and, more recently, distances using multiple sequence alignments (MSAs). In this work we present AttentiveDist, a novel approach that uses different MSAs generated with different E-values in a single model to increase the co-evolutionary information provided to the model. To determine the importance of each MSA’s feature at the inter-residue level, we added an attention layer to the deep neural network. The model is trained in a multi-task fashion to also predict backbone and orientation angles, further improving the inter-residue distance prediction. We show that AttentiveDist outperforms the top methods for contact prediction in the CASP13 structure prediction competition. To aid in structure modeling, we also developed two new deep learning-based sidechain center distance and peptide-bond nitrogen-oxygen distance prediction models. Together these led to a 12% increase in TM-score from the best server method in CASP13 for structure prediction.
- Published
- 2020
- Full Text
- View/download PDF
27. Protein Contact Map Refinement for Improving Structure Prediction Using Generative Adversarial Networks
- Author
-
Sai Raghavendra Maddhuri Venkata Subramaniya, Yuki Kagaya, Aashish Jain, Genki Terashi, and Daisuke Kihara
- Subjects
Statistics and Probability ,Boosting (machine learning) ,Computer science ,Pipeline (computing) ,Protein contact map ,Machine learning ,computer.software_genre ,Biochemistry ,03 medical and health sciences ,0302 clinical medicine ,Molecular Biology ,030304 developmental biology ,Structure (mathematical logic) ,Supplementary data ,0303 health sciences ,business.industry ,Protein structure prediction ,Original Papers ,Protein tertiary structure ,Computer Science Applications ,Computational Mathematics ,Computational Theory and Mathematics ,Artificial intelligence ,business ,computer ,030217 neurology & neurosurgery ,Generative grammar - Abstract
Motivation: Protein structure prediction remains one of the most important problems in computational biology and biophysics. In the past few years, protein residue–residue contact prediction has undergone substantial improvement, which has made it a critical driving force for successful protein structure prediction. Boosting the accuracy of contact predictions has, therefore, become the forefront of protein structure prediction. Results: We show a novel contact map refinement method, ContactGAN, which uses Generative Adversarial Networks (GAN). ContactGAN was able to make a significant improvement over predictions made by recent contact prediction methods when tested on three datasets including protein structure modeling targets in CASP13 and CASP14. We show improvement of precision in contact prediction, which translated into improvement in the accuracy of protein tertiary structure models. On the other hand, observed improvement over trRosetta was relatively small, reasons for which are discussed. ContactGAN will be a valuable addition in the structure prediction pipeline to achieve an extra gain in contact prediction accuracy. Availability and implementation: https://github.com/kiharalab/ContactGAN. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2020
28. Emap2sec+: Detecting Protein and DNA/RNA Structures in Cryo-EM Maps of Intermediate Resolution Using Deep Learning
- Author
-
Tunde Aderinwale, Genki Terashi, S. R. Maddhuri Venkata Subramaniya, Daisuke Kihara, Eman Alnabati, and Xiao Wang
- Subjects
Physics ,Cryo-electron microscopy ,Resolution (electron density) ,RNA ,computer.software_genre ,Convolutional neural network ,chemistry.chemical_compound ,chemistry ,Voxel ,Biological system ,computer ,Protein secondary structure ,DNA ,Macromolecule - Abstract
An increasing number of density maps of macromolecular structures, including proteins and protein and DNA/RNA complexes, have been determined by cryo-electron microscopy (cryo-EM). Although lately maps at a near-atomic resolution are routinely reported, there are still substantial fractions of maps determined at intermediate or low resolutions, where extracting structure information is not trivial. Here, we report a new computational method, Emap2sec+, which identifies DNA or RNA as well as the secondary structures of proteins in cryo-EM maps of 5 to 10 Å resolution. Emap2sec+ employs a deep residual convolutional neural network. Emap2sec+ assigns structural labels with associated probabilities at each voxel in a cryo-EM map, which will help structure modeling in an EM map. When tested on simulated and experimental maps, Emap2sec+ showed stable and high assignment accuracy for nucleotides in low resolution maps and better performance for protein secondary structure assignments than its earlier version.
- Published
- 2020
- Full Text
- View/download PDF
29. Protein Contact Map Denoising Using Generative Adversarial Networks
- Author
-
Daisuke Kihara, Aashish Jain, Sai Raghavendra Maddhuri Venkata Subramaniya, Genki Terashi, and Yuki Kagaya
- Subjects
Structure (mathematical logic) ,Protein sequencing ,Computer science ,Protein contact map ,Pipeline (computing) ,Noise reduction ,Data mining ,Protein structure prediction ,computer.software_genre ,computer ,Protein tertiary structure ,Generative grammar - Abstract
Protein residue-residue contact prediction from protein sequence information has undergone substantial improvement in the past few years, which has made it a critical driving force for building correct protein tertiary structure models. Improving accuracy of contact predictions has, therefore, become the forefront of protein structure prediction. Here, we show a novel contact map denoising method, ContactGAN, which uses Generative Adversarial Networks (GAN) to refine predicted protein contact maps. ContactGAN was able to make a consistent and significant improvement over predictions made by recent contact prediction methods when tested on two datasets including protein structure modeling targets in CASP13. ContactGAN will be a valuable addition in the structure prediction pipeline to achieve an extra gain in contact prediction accuracy.
- Published
- 2020
- Full Text
- View/download PDF
30. Outcomes of the 2019 EMDataResource model challenge: validation of cryo-EM models at near-atomic resolution
- Author
-
Gunnar F. Schröder, Carmen J. Williams, Daisuke Kihara, Jonas Pfab, Tianqi Wu, Monastyrskyy B, Wang Z, Kevin Cowtan, Andrea C. Vaiana, Luisa U. Schäfer, Mark A. Herzik, Jianlin Cheng, Dilip Kumar, Renzhi Cao, Martyn Winn, Wah Chiu, Kryshtafovych A, Benjamin A Barad, Michael F. Schmid, Ken A. Dill, Genki Terashi, Singharoy A, Daniel P. Farrell, Li-Wei Hung, Pavel V. Afonine, Ardan Patwardhan, Stephanie A. Wankowicz, James S. Fraser, Jane S. Richardson, Paul D. Adams, Alberto Perez, Catherine L. Lawson, Mrinal Shekhar, Xiaodi Yu, Liguo Wang, Agnel Praveen Joseph, Paul S. Bond, Mateusz Olek, Colin M. Palmer, Helen M. Berman, Dong Si, Peter B. Rosenthal, Matthew L. Baker, Grzegorz Chojnowski, Grigore D. Pintilie, Thomas C. Terwilliger, Kaiming Zhang, Sumit Mittal, Jie Hou, Soon Wen Hoh, Depanjan Sarkar, Frank DiMaio, Maxim Igaev, and Tom Burnley
- Subjects
Structure (mathematical logic) ,Computer science ,business.industry ,media_common.quotation_subject ,Context (language use) ,computer.file_format ,Protein Data Bank ,computer.software_genre ,Software ,Atomic resolution ,Benchmark (surveying) ,Quality (business) ,Data mining ,Focus (optics) ,business ,computer ,media_common - Abstract
This paper describes outcomes of the 2019 Cryo-EM Map-based Model Metrics Challenge sponsored by EMDataResource (www.emdataresource.org). The goals of this challenge were (1) to assess the quality of models that can be produced using current modeling software, (2) to check the reproducibility of modeling results from different software developers and users, and (3) to compare the performance of current metrics used for evaluation of models. The focus was on near-atomic resolution maps with an innovative twist: three of four target maps formed a resolution series (1.8 to 3.1 Å) from the same specimen and imaging experiment. Tools developed in previous challenges were expanded for managing, visualizing and analyzing the 63 submitted coordinate models, and several novel metrics were introduced. The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual laboratory experiments and holdings of structure data archives such as the Protein Data Bank. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived from these benchmark maps by 13 participating teams, representing both widely used and novel modeling approaches. We also evaluate the pros and cons of the commonly used metrics to assess model quality and recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed density in the cryo-EM map.
- Published
- 2020
- Full Text
- View/download PDF
31. Deep learning-based local quality estimation for protein structure models from cryo-EM maps
- Author
-
Genki Terashi, Xiao Wang, Sai Raghavendra Maddhuri Venkata Subramaniya, John J. Tesmer, and Daisuke Kihara
- Subjects
Biophysics - Published
- 2022
- Full Text
- View/download PDF
32. De novo main-chain modeling with MAINMAST in 2015/2016 EM Model Challenge
- Author
-
Daisuke Kihara and Genki Terashi
- Subjects
0301 basic medicine ,Map interpretation ,Computer science ,Protein Conformation ,Minimum spanning tree ,Article ,Interpretation (model theory) ,03 medical and health sciences ,0302 clinical medicine ,Software ,Chain (algebraic topology) ,Structural Biology ,Position (vector) ,MAINMAST ,Rosetta ,Electron microscopy ,Confidence score ,Protocol (object-oriented programming) ,Cryo-EM ,business.industry ,Cryoelectron Microscopy ,Proteins ,Longest path problem ,030104 developmental biology ,Main-chain trace ,CryoEM Model Challenge ,Protein structure modeling ,confidence score ,business ,Algorithm ,030217 neurology & neurosurgery ,Mean shifting algorithm - Abstract
Protein tertiary structure modeling is a critical step for the interpretation of three-dimensional (3D) electron microscopy density. Our group participated in the 2015/2016 EM Model Challenge using the MAINMAST software for de novo main-chain modeling. The software generates local dense points using the mean shifting algorithm, and connects them into Cα models by calculating the minimum spanning tree and the longest path. Subsequently, full atom structure models are generated, which are subject to structural refinement. Here, we summarize the qualities of our submitted models and examine successful and unsuccessful models, including 3D models we did not submit to the Challenge. Our protocol using the MAINMAST software was sometimes able to build correct conformations with 3.4–5.1 Å RMSD. Unsuccessful models had failures in the chain traces; however, their Cα positions and some local structures were largely correct. To evaluate the quality of the models, the MAINMAST software provides a confidence score for each Cα position from the consensus of the top 100 scoring models.
- Published
- 2018
33. SHREC2020 track: Multi-domain protein shape retrieval challenge
- Author
-
Florent Langenfeld, David Hunter, Matthieu Montes, Karim Hammoudi, Daisuke Kihara, Feryal Windal, Yu-Kun Lai, Ekpo Otu, Paul L. Rosin, Stelios K. Mylonas, Petros Daras, Apostolos Axenopoulos, Halim Benhabiles, Reyer Zwiggelaar, Andrea Giachetti, Charles Christoffer, Adnane Cabani, Tunde Aderinwale, Yuxu Peng, Yonghuai Liu, Mahmoud Melkemi, Genki Terashi, Cardiff Univ, Sch Chem, Cardiff CF10 3XQ, S Glam, Wales, Purdue University [West Lafayette], Institut d’Électronique, de Microélectronique et de Nanotechnologie - UMR 8520 (IEMN), Centrale Lille-Institut supérieur de l'électronique et du numérique (ISEN)-Université de Valenciennes et du Hainaut-Cambrésis (UVHC)-Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Université Polytechnique Hauts-de-France (UPHF), Bio-Micro-Electro-Mechanical Systems - IEMN (BIOMEMS - IEMN), Centrale Lille-Institut supérieur de l'électronique et du numérique (ISEN)-Université de Valenciennes et du Hainaut-Cambrésis (UVHC)-Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Université Polytechnique Hauts-de-France (UPHF)-Centrale Lille-Institut supérieur de l'électronique et du numérique (ISEN)-Université de Valenciennes et du Hainaut-Cambrésis (UVHC)-Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Université Polytechnique Hauts-de-France (UPHF), Institut de Recherche en Informatique Mathématiques Automatique Signal (IRIMAS), Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA)), Université de Strasbourg (UNISTRA), Institut de Recherche en Systèmes Electroniques Embarqués (IRSEEM), Université de Rouen Normandie (UNIROUEN), Normandie Université (NU)-Normandie Université (NU)-École Supérieure d’Ingénieurs en Génie Électrique (ESIGELEC), Clinica Oculistica, Università degli Studi di Verona, Aberystwyth Univ, Inst Biol Environm & Rural Sci, Aberystwyth SY23 3EB, Dyfed, Wales, Young teachers growth plan project - Changsha University of Science 
Technology [2019QJCZ014], ATXN1-MED15 PPI project - GSRT Hellenic Foundation for Research and Innovation, European Research Council Executive Agency [640283], Laboratoire Génomique, bioinformatique et chimie moléculaire (GBCM), Conservatoire National des Arts et Métiers [CNAM] (CNAM), HESAM Université - Communauté d'universités et d'établissements Hautes écoles Sorbonne Arts et métiers université (HESAM)-HESAM Université - Communauté d'universités et d'établissements Hautes écoles Sorbonne Arts et métiers université (HESAM), School of Information Science and Engineering [Changsha], Central South University [Changsha], School of Computer Sciences & Informatics [Cardiff], Cardiff University, Università degli studi di Verona = University of Verona (UNIVR), Centre for Research and Technology Hellas (CERTH), Aberystwyth University, and Edge Hill University
- Subjects
Computer science ,3D shape analysis ,02 engineering and technology ,Computational biology ,Domain (software engineering) ,3D shape descriptor ,[SPI]Engineering Sciences [physics] ,Protein structure ,Species level ,0202 electrical engineering, electronic engineering, information engineering ,Protein structure comparison ,Protein shape ,business.industry ,Specific function ,3D shape matching ,General Engineering ,020207 software engineering ,Modular design ,Computer Graphics and Computer-Aided Design ,Human-Computer Interaction ,Multi domain ,SHREC ,3D shape retrieval ,020201 artificial intelligence & image processing ,business ,Scope (computer science) - Abstract
[#17491] Article following an oral conference presentation: 13th EG Euroworkshop on 3D Object Retrieval, 3DOR 2020, Graz, Austria, September 4-5, 2020; International audience; Proteins are natural modular objects usually composed of several domains, each domain bearing a specific function that is mediated through its surface, which is accessible to vicinal molecules. This draws attention to an understudied characteristic of protein structures: the surface, which is mostly unexploited by protein structure comparison methods. In the present work, we evaluated the performance of six shape comparison methods, among which three are based on machine learning, to distinguish between 588 multi-domain proteins and to recreate the evolutionary relationships at the protein and species levels of the SCOPe database. The six groups that participated in the challenge submitted a total of 15 sets of results. We observed that the performance of all the methods significantly decreases at the species level, suggesting that shape-only protein comparison is challenging for closely related proteins. Even if the dataset is limited in size (only 588 proteins are considered whereas more than 160,000 protein structures are experimentally solved), we think that this work provides useful insights into the performance of current shape comparison methods, and highlights possible limitations to large-scale applications due to the computational cost. © 2020 The Author(s). Published by Elsevier Ltd.
- Published
- 2020
- Full Text
- View/download PDF
34. Performance and enhancement of the LZerD protein assembly pipeline in CAPRI 38-46
- Author
-
Charles Christoffer, Daisuke Kihara, Jacob Verburgt, Lenna X. Peterson, Woong-Hee Shin, Genki Terashi, Sai Raghavendra Maddhuri Venkata Subramaniya, and Tunde Aderinwale
- Subjects
Protein Conformation, alpha-Helical ,Computer science ,computer.software_genre ,Ligands ,Biochemistry ,Article ,03 medical and health sciences ,Critical Assessment of Prediction of Interactions ,Structural Biology ,Protein Interaction Mapping ,Humans ,Human group ,Macromolecular docking ,Protein Interaction Domains and Motifs ,Amino Acid Sequence ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,Binding Sites ,030302 biochemistry & molecular biology ,Protein pair ,Proteins ,Protein structure prediction ,Molecular Docking Simulation ,Template ,Docking (molecular) ,Research Design ,Structural Homology, Protein ,Protein Conformation, beta-Strand ,Data mining ,Peptides ,computer ,Algorithms ,Software ,Protein Binding - Abstract
We report the performance of the protein docking prediction pipeline of our group and the results for Critical Assessment of Prediction of Interactions (CAPRI) rounds 38-46. The pipeline integrates programs developed in our group as well as other existing scoring functions. The core of the pipeline is the LZerD protein-protein docking algorithm. If templates of the target complex are not found in PDB, the first step of our docking prediction pipeline is to run LZerD for a query protein pair. Meanwhile, in the case of human group prediction, we survey the literature to find information that can guide the modeling, such as protein-protein interface information. In addition to any literature information and binding residue prediction, generated docking decoys were selected by a rank aggregation of statistical scoring functions. The top 10 decoys were relaxed by a short molecular dynamics simulation before submission to remove atom clashes and improve side-chain conformations. In these CAPRI rounds, our group, particularly the LZerD server, showed robust performance. On the other hand, there are failed cases where some other groups were successful. To understand weaknesses of our pipeline, we analyzed sources of errors for failed targets. Since we noted that structure refinement is a step that needs improvement, we newly performed a comparative study of several refinement approaches. Finally, we show several examples that illustrate successful and unsuccessful cases by our group.
- Published
- 2019
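Note on the rank aggregation mentioned in the abstract above: the record does not say which aggregation rule the pipeline uses, so the following is only a minimal sketch assuming a plain rank-sum scheme in which lower scores are better. The function name `rank_sum_aggregate` and the score labels are hypothetical and are not taken from LZerD.

```python
from typing import Dict, List


def rank_sum_aggregate(scores: Dict[str, List[float]]) -> List[int]:
    """Order decoy indices best-first by the sum of their per-function ranks.

    Assumptions (not from the source): every scoring function assigns
    lower values to better decoys, and all functions are weighted equally.
    """
    n = len(next(iter(scores.values())))
    rank_sums = [0] * n
    for name, values in scores.items():
        if len(values) != n:
            raise ValueError(f"score list {name!r} has the wrong length")
        # Rank 0 is the best decoy under this scoring function.
        # Ties are broken by decoy index because sorted() is stable.
        order = sorted(range(n), key=lambda i: values[i])
        for rank, idx in enumerate(order):
            rank_sums[idx] += rank
    # Smallest summed rank first; decoy index breaks ties.
    return sorted(range(n), key=lambda i: (rank_sums[i], i))


# Four hypothetical decoys scored by three hypothetical functions.
example_scores = {
    "score_a": [-3.0, -1.0, -2.0, 0.5],
    "score_b": [-10.0, -30.0, -20.0, -5.0],
    "score_c": [0.2, 0.9, 0.1, 0.8],
}
ranking = rank_sum_aggregate(example_scores)
top_decoys = ranking[:10]  # the abstract keeps the top 10 decoys
```

With real data, `top_decoys` would hold the indices of the decoys passed on to the relaxation step; here the pool has only four entries, so all of them are kept.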
35. CryoFold: determining protein structures and ensembles from cryo-EM data
- Author
-
Gaspard Debussche, Daisuke Kihara, Ken A. Dill, Mrinal Shekhar, Jonathan Nguyen, Nicholas J. Sisco, Genki Terashi, Petra Fromme, Alberto Perez, Daipayan Sarkar, Chitrak Gupta, James Zook, Arup Mondal, Wade D. Van Horn, Emad Tajkhorshid, Abhishek Singharoy, and John Vant
- Subjects
Physics ,0303 health sciences ,03 medical and health sciences ,Sequence ,Protein structure ,010304 chemical physics ,Cryo-electron microscopy ,Computation ,0103 physical sciences ,Resolution (electron density) ,Biological system ,01 natural sciences ,030304 developmental biology - Abstract
Cryo-EM is a powerful method for determining protein structures. But it requires computational assistance. Physics-based computations have the power to give low-free-energy structures and ensembles of populations, but have been computationally limited to only small soluble proteins. Here, we introduce CryoFold. By integrating data of varying sparsity from electron density maps of 3–5 Å resolution with coarse-grained physical knowledge of secondary and tertiary interactions, CryoFold determines ensembles of protein structures directly from sequence. We give six examples showing its broad capabilities, over proteins ranging from 72 to 2000 residues, including membrane and multi-domain proteins, and including results from two EMDB competitions. The ensembles CryoFold predicts starting from the density data of a single known protein conformation encompass multiple low-energy conformations, all of which are experimentally validated and biologically relevant.
- Published
- 2019
- Full Text
- View/download PDF
36. Protein docking model evaluation by 3D deep convolutional neural networks
- Author
-
Genki Terashi, Mengmeng Zhu, Daisuke Kihara, Xiao Wang, and Charles Christoffer
- Subjects
Statistics and Probability ,Computer science ,computer.software_genre ,Machine learning ,Biochemistry ,Convolutional neural network ,03 medical and health sciences ,Voxel ,Macromolecular docking ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,Artificial neural network ,business.industry ,Deep learning ,030302 biochemistry & molecular biology ,A protein ,Proteins ,Original Papers ,Computer Science Applications ,Computational Mathematics ,Computational Theory and Mathematics ,Docking (molecular) ,Artificial intelligence ,Neural Networks, Computer ,business ,computer - Abstract
Motivation: Many important cellular processes involve physical interactions of proteins. Therefore, determining protein quaternary structures provides critical insights for understanding the molecular mechanisms of the functions of the complexes. To complement experimental methods, many computational methods have been developed to predict structures of protein complexes. One of the challenges in computational protein complex structure prediction is to identify near-native models from a large pool of generated models. Results: We developed a convolutional deep neural network-based approach named DOcking decoy selection with Voxel-based deep neural nEtwork (DOVE) for evaluating protein docking models. To evaluate a protein docking model, DOVE scans the protein–protein interface of the model with a 3D voxel and considers atomic interaction types and their energetic contributions as input features applied to the neural network. The deep learning models were trained and validated on docking models available in the ZDock and DockGround databases. Among the different combinations of features tested, almost all outperformed existing scoring functions. Availability and implementation: Codes available at http://github.com/kiharalab/DOVE, http://kiharalab.org/dove/. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2019
37. MAINMAST: de novo protein structure modeling for cryo-EM maps assisted by structure feature detection by deep learning
- Author
-
Genki Terashi, Xiao Wang, and Daisuke Kihara
- Subjects
business.industry ,Computer science ,Cryo-electron microscopy ,Deep learning ,Pattern recognition ,Condensed Matter Physics ,Biochemistry ,Inorganic Chemistry ,Structural Biology ,General Materials Science ,Artificial intelligence ,Physical and Theoretical Chemistry ,business ,Protein structure modeling ,Feature detection (computer vision) - Published
- 2021
- Full Text
- View/download PDF
38. Emap2sec+: detecting protein and DNA/RNA structures in cryo-EM maps of intermediate resolution using deep learning
- Author
-
Xiao Wang, Tunde Aderinwale, Genki Terashi, Eman Alnabati, Sai Raghavendra Maddhuri Venkata Subramaniya, and Daisuke Kihara
- Subjects
Materials science ,business.industry ,Cryo-electron microscopy ,Deep learning ,Resolution (electron density) ,Condensed Matter Physics ,Biochemistry ,Inorganic Chemistry ,Structural Biology ,Biophysics ,General Materials Science ,Artificial intelligence ,Physical and Theoretical Chemistry ,business - Published
- 2021
- Full Text
- View/download PDF
39. Vesper: Global and Local Cryo-Em Map Alignment and Database Search using Local Density Vectors
- Author
-
Genki Terashi, Siyang Chen, Daisuke Kihara, Charles Christoffer, and Xusi Han
- Subjects
Cryo-electron microscopy ,business.industry ,Computer science ,Biophysics ,Pattern recognition ,Database search engine ,Artificial intelligence ,business - Published
- 2021
- Full Text
- View/download PDF
40. Super Resolution Cryo-EM Maps with 3D Deep Generative Networks
- Author
-
Genki Terashi, Sai Raghavendra Maddhuri Venkata Subramaniya, and Daisuke Kihara
- Subjects
0303 health sciences ,Computer science ,Cryo-electron microscopy ,business.industry ,Deep learning ,Resolution (electron density) ,Biophysics ,Resolution improvement ,Superresolution ,03 medical and health sciences ,Range (mathematics) ,0302 clinical medicine ,Artificial intelligence ,Angstrom ,business ,Algorithm ,030217 neurology & neurosurgery ,Generative grammar ,030304 developmental biology - Abstract
An increasing number of biological macromolecules have been solved with cryo-electron microscopy (cryo-EM). Over the past few years, the resolutions of density maps determined by cryo-EM have largely improved in general. However, there are still many cases where the resolution is not high enough to model molecular structures with standard computational tools. If the resolution obtained is near the empirical borderline (3-4 Angstroms), a small improvement of resolution will significantly facilitate structure modeling. Here, we report SuperEM, a novel deep learning-based method that uses a three-dimensional generative adversarial network for generating an improved-resolution EM map from an experimental EM map. SuperEM is designed to work with EM maps in the resolution range of 3 Angstroms to 6 Angstroms and has shown an average resolution improvement of 1.0 Angstrom on a test dataset of 36 experimental maps. The generated super-resolution maps are shown to result in better structure modelling of proteins.
- Published
- 2021
- Full Text
- View/download PDF
41. Human and server docking prediction for CAPRI round 30‐35 using LZerD with combined scoring functions
- Author
-
A. Roy, Lenna X. Peterson, Jian Zhang, Daisuke Kihara, Juan Esquivel-Rodríguez, Woong-Hee Shin, Xusi Han, Hyungrae Kim, Matthew R. Lee, and Genki Terashi
- Subjects
0301 basic medicine ,Protein Conformation ,Computer science ,Amino Acid Motifs ,computer.software_genre ,Biochemistry ,Article ,03 medical and health sciences ,Scoring functions for docking ,Critical Assessment of Prediction of Interactions ,Structural Biology ,Protein Interaction Mapping ,Humans ,Molecular Biology ,Lead Finder ,Binding Sites ,030102 biochemistry & molecular biology ,Computational Biology ,Proteins ,Water ,Protein structure prediction ,Molecular Docking Simulation ,Benchmarking ,030104 developmental biology ,Protein–ligand docking ,Ranking ,Research Design ,Structural Homology, Protein ,Docking (molecular) ,Thermodynamics ,Data mining ,Protein Multimerization ,Surface protein ,computer ,Algorithms ,Software ,Protein Binding - Abstract
We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues' spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, that is whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. Proteins 2017; 85:513-527. © 2016 Wiley Periodicals, Inc.
- Published
- 2016
- Full Text
- View/download PDF
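Note on the score-correlation finding in the abstract above: the record does not state which correlation measure was used or how it was turned into a prediction of pool quality. The sketch below only shows, under that caveat, how the Pearson correlation between two scoring functions over one decoy pool could be computed. The function name `pearson` and the sample values are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores of the same five decoys under two scoring functions.
scores_a = [1.0, 2.0, 3.0, 4.0, 5.0]
scores_b = [2.0, 4.0, 6.0, 8.0, 10.0]
agreement = pearson(scores_a, scores_b)
```

How such a value relates to the presence of near-native models in the pool is a result of the paper itself and is not reproduced here.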
42. De Novo Computational Protein Tertiary Structure Modeling Pipeline for Cryo-EM Maps of Intermediate Resolution
- Author
-
Sai Raghavendra Maddhuri Venkata Subramaniya, Genki Terashi, and Daisuke Kihara
- Subjects
Cryo-electron microscopy ,Pipeline (computing) ,Resolution (electron density) ,Biophysics ,Protein tertiary structure ,Geology ,Computational science - Published
- 2020
- Full Text
- View/download PDF
43. De Novo Protein Structure Modeling Tool MAINMAST Enhanced for Multiple Chain Complexes and Bound Ligands
- Author
-
Genki Terashi and Daisuke Kihara
- Subjects
Chain (algebraic topology) ,Chemistry ,Stereochemistry ,Biophysics ,Protein structure modeling - Published
- 2020
- Full Text
- View/download PDF
44. Protein Secondary Structure Detection in Intermediate-Resolution Cryo-EM Maps using Deep Learning
- Author
-
Sai Raghavendra Maddhuri Venkata Subramaniya, Genki Terashi, and Daisuke Kihara
- Subjects
Models, Molecular ,Cryo-electron microscopy ,Computer science ,Biophysics ,Biochemistry ,Convolutional neural network ,Article ,Protein Structure, Secondary ,03 medical and health sciences ,Software ,Deep Learning ,0302 clinical medicine ,Humans ,Point (geometry) ,Molecular Biology ,Protein secondary structure ,030304 developmental biology ,0303 health sciences ,Artificial neural network ,business.industry ,Deep learning ,Cryoelectron Microscopy ,Resolution (electron density) ,Proteins ,Pattern recognition ,Cell Biology ,Grid ,Neural Networks, Computer ,Artificial intelligence ,business ,030217 neurology & neurosurgery ,Biotechnology - Abstract
Although structures determined at near-atomic resolution are now routinely reported by cryo-electron microscopy (cryo-EM), many density maps are determined at an intermediate resolution, and extracting structure information from these maps is still a challenge. We report a computational method, Emap2sec, that identifies the secondary structures of proteins (α-helices, β-sheets and other structures) in EM maps at resolutions of between 5 and 10 Å. Emap2sec uses a three-dimensional deep convolutional neural network to assign secondary structure to each grid point in an EM map. We tested Emap2sec on EM maps simulated from 34 structures at resolutions of 6.0 and 10.0 Å, as well as on 43 maps determined experimentally at resolutions of between 5.0 and 9.5 Å. Emap2sec was able to clearly identify the secondary structures in many maps tested, and showed substantially better performance than existing methods. Emap2sec uses a deep convolutional neural network to assign protein secondary structures in intermediate-resolution (5–10 Å) cryo-EM maps.
- Published
- 2020
- Full Text
- View/download PDF
45. MAINMAST-MELD-MDFF: Denovo Structure-Determination with Data-Guided Molecular Dynamics
- Author
-
Mrinal Shekhar, Emad Tajkhorshid, Ken A. Dill, Daisuke Kihara, Genki Terashi, Abhishek Singharoy, and Alberto Perez
- Subjects
Molecular dynamics ,Chemistry ,Chemical physics ,Biophysics ,Structure (category theory) - Published
- 2019
- Full Text
- View/download PDF
46. Modeling the assembly order of multimeric heteroprotein complexes
- Author
-
Genki Terashi, Juan Esquivel-Rodríguez, Lenna X. Peterson, A. Roy, Woong-Hee Shin, Charles Christoffer, Yoichiro Togawa, and Daisuke Kihara
- Subjects
0301 basic medicine ,Computer science ,Complex formation ,Protein Structure Prediction ,Biochemistry ,Molecular Docking Simulation ,0302 clinical medicine ,Computational Chemistry ,Protein structure ,Protein Interaction Mapping ,Macromolecular Structure Analysis ,Databases, Protein ,lcsh:QH301-705.5 ,Free Energy ,0303 health sciences ,Ecology ,Physics ,Protein structure prediction ,Chemistry ,Order (biology) ,Computational Theory and Mathematics ,Modeling and Simulation ,Physical Sciences ,Molecular Mechanics ,Thermodynamics ,Experimental methods ,Protein Structure Determination ,Biological system ,Algorithms ,Protein Binding ,Research Article ,Cholera Toxin ,Protein Structure ,Multiprotein complex ,Chemical physics ,Protein subunit ,Protein domain ,Biophysics ,Protein–protein interaction ,03 medical and health sciences ,Cellular and Molecular Neuroscience ,Protein Domains ,Genetics ,Humans ,Protein Interactions ,Molecular Biology ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,Models, Statistical ,Helicobacter pylori ,Computational Biology ,Proteins ,Biology and Life Sciences ,Protein Complexes ,Dimers (Chemical physics) ,030104 developmental biology ,lcsh:Biology (General) ,Protein-Protein Interactions ,Docking (molecular) ,Path (graph theory) ,Protein structure modeling ,030217 neurology & neurosurgery ,Software - Abstract
Protein-protein interactions are the cornerstone of numerous biological processes. Although an increasing number of protein complex structures have been determined using experimental methods, relatively few studies have been performed to determine the assembly order of complexes. In addition to the insights into the molecular mechanisms of biological function provided by the structure of a complex, knowing the assembly order is important for understanding the process of complex formation. Assembly order is also practically useful for constructing subcomplexes as a step toward solving the entire complex experimentally, designing artificial protein complexes, and developing drugs that interrupt a critical step in the complex assembly. There are several experimental methods for determining the assembly order of complexes; however, these techniques are resource-intensive. Here, we present a computational method that predicts the assembly order of protein complexes by building the complex structure. The method, named Path-LZerD, uses a multimeric protein docking algorithm that assembles a protein complex structure from individual subunit structures and predicts assembly order by observing the simulated assembly process of the complex. Benchmarked on a dataset of complexes with experimental evidence of assembly order, Path-LZerD was successful in predicting the assembly pathway for the majority of the cases. Moreover, when compared with a simple approach that infers the assembly path from the buried surface area of subunits in the native complex, Path-LZerD has the strong advantage that it can be used for cases where the complex structure is not known. The path prediction accuracy decreased when starting from unbound monomers, particularly for larger complexes of five or more subunits, for which only a part of the assembly path was correctly identified. 
As the first method of its kind, Path-LZerD opens a new area of computational protein structure modeling and will be an indispensable approach for studying protein complexes. Author summary: Protein-protein interactions, particularly those involving multiple proteins, are the cornerstone of numerous biological processes. Although an increasing number of multi-chain protein complex structures have been determined, fewer studies have been performed to determine the assembly order of complexes. Knowing the assembly order of a complex provides insights into the process of complex formation. Assembly order is also practically useful for reconstructing and determining the structure of a subcomplex of a large protein complex. It also has important applications including designing artificial protein complexes and drugs that prevent the assembly of protein complexes. We present a computational method, Path-LZerD, which predicts the assembly order of a protein complex by simulating its assembly process. This is the first method of this kind. A strong advantage of Path-LZerD is that the assembly order can be predicted even when the overall complex structure is not known. Path-LZerD opens a new area of computational protein structure modeling and will be an indispensable approach for studying protein complexes.
- Published
- 2018
47. Protein structure model refinement in CASP12 using short and long molecular dynamics simulations in implicit solvent
- Author
-
Genki Terashi and Daisuke Kihara
- Subjects
0301 basic medicine ,Models, Molecular ,Computer science ,Protein Conformation ,Structure (category theory) ,Molecular Dynamics Simulation ,Crystallography, X-Ray ,Biochemistry ,Article ,03 medical and health sciences ,Molecular dynamics ,Protein structure ,Structural Biology ,Sequence Analysis, Protein ,Humans ,CASP ,Molecular Biology ,Protein secondary structure ,Topology (chemistry) ,Computational Biology ,Proteins ,Protein structure prediction ,Crystallography ,030104 developmental biology ,Template ,Solvents ,Algorithm ,Algorithms - Abstract
Protein structure prediction has matured over the years, particularly for methods that use structure templates to build a model. Such methods can build a model with the correct overall conformation in cases where appropriate templates are available. Models with the correct topology can be practically useful for limited purposes that need residue-level accuracy, but further improvement of the models can allow them to be used in tasks that need detailed structures, such as molecular replacement in X-ray crystallography or structure-based drug screening. Thus, model refinement is an important final step in protein structure prediction to bridge predictions to real-life applications. Model refinement is one of the categories in recent rounds of Critical Assessment of techniques in protein Structure Prediction (CASP) and has recently been drawing more attention as its importance has become recognized. Here we report our group’s performance in the refinement category in CASP12. Our method is based on inexpensive short molecular dynamics (MD) simulations in implicit solvent. Our performance in CASP12 was among the top, which was consistent with the previous round, CASP11. Our method with short MD runs achieved comparable performance with other methods that used longer simulations. Detailed analyses found that improvements typically occurred in entire regions of a structure rather than only in flexible loop regions. Remaining challenges in structure refinement include large conformational refinement, which involves substantial motions of secondary structure elements or domains.
- Published
- 2017
48. Modeling disordered protein interactions from biophysical principles
- Author
-
A. Roy, Genki Terashi, Daisuke Kihara, Lenna X. Peterson, and Charles Christoffer
- Subjects
Proteomics ,Models, Molecular ,0301 basic medicine ,Protein Structure Prediction ,Molecular Dynamics ,Biochemistry ,Database and Informatics Methods ,Mice ,Computational Chemistry ,Sequence Analysis, Protein ,Macromolecular Structure Analysis ,lcsh:QH301-705.5 ,Ecology ,Proteomic Databases ,Chemistry ,Physics ,Computational Theory and Mathematics ,Modeling and Simulation ,Physical Sciences ,Structural Proteins ,Protein Structure Determination ,Experimental methods ,Research Article ,Protein Binding ,Protein Structure ,Biophysics ,Computational biology ,Research and Analysis Methods ,Protein–protein interaction ,03 medical and health sciences ,Cellular and Molecular Neuroscience ,Genetics ,Animals ,Humans ,Amino Acid Sequence ,Binding site ,Molecular Biology ,Ecology, Evolution, Behavior and Systematics ,Binding Sites ,Biology and Life Sciences ,Proteins ,Computational Biology ,Protein tertiary structure ,Intrinsically Disordered Proteins ,Biological Databases ,030104 developmental biology ,lcsh:Biology (General) ,Docking (molecular) ,Protein structure modeling - Abstract
Disordered protein-protein interactions (PPIs), those involving a folded protein and an intrinsically disordered protein (IDP), are prevalent in the cell, including important signaling and regulatory pathways. IDPs do not adopt a single dominant structure in isolation but often become ordered upon binding. To aid understanding of the molecular mechanisms of disordered PPIs, it is crucial to obtain the tertiary structure of the PPIs. However, experimental methods have difficulty in solving disordered PPIs and existing protein-protein and protein-peptide docking methods are not able to model them. Here we present a novel computational method, IDP-LZerD, which models the conformation of a disordered PPI by considering the biophysical binding mechanism of an IDP to a structured protein, whereby a local segment of the IDP initiates the interaction and subsequently the remaining IDP regions explore and coalesce around the initial binding site. On a dataset of 22 disordered PPIs with IDPs up to 69 amino acids, successful predictions were made for 21 bound and 18 unbound receptors. The successful modeling provides additional support for biophysical principles. Moreover, the new technique significantly expands the capability of protein structure modeling and provides crucial insights into the molecular mechanisms of disordered PPIs. Author summary: A substantial fraction of the proteins encoded in genomes are intrinsically disordered proteins (IDPs), which lack a single stable structure in the native state. IDPs serve many functions including mediating protein-protein interactions (PPIs). Such disordered PPIs are prevalent in important regulatory pathways, including many interactions of the tumor suppressor protein p53. 
To elucidate the molecular mechanisms of disordered PPIs, obtaining tertiary structure information is essential; however, they are difficult to study with experimental techniques and existing computational protein-protein and protein-peptide modeling methods are unable to model disordered PPIs. Here we present a novel computational method for modeling the structure of disordered PPIs, which is the first of this sort. The method, IDP-LZerD, is designed to follow a known biophysical picture of the mechanism of how IDPs interact with structured proteins. IDP-LZerD successfully modeled the majority of disordered PPIs tested. This technique opens up new possibilities for structural studies of IDPs and their interactions.
- Published
- 2017
49. Improved De Novo Main-Chain Tracing Method Mainmast for Multi-Chain Modeling, Local Refinement, and Graphical User Interface
- Author
-
Genki Terashi, Yuhong Zha, and Daisuke Kihara
- Subjects
Chain (algebraic topology) ,business.industry ,Computer science ,Biophysics ,Tracing ,business ,Graphical user interface ,Computational science - Published
- 2019
- Full Text
- View/download PDF
50. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions
- Author
-
Howook Hwang, Shiyong Liu, Xiaoqin Zou, Huan-Xiang Zhou, Hideaki Umeyama, Paul A. Bates, Hahnbeom Park, Yangyu Huang, Xiaolei Zhu, Marianne Rooman, Rudi Agius, David Baker, Sarel J. Fleishman, Dimitri Gillis, Eiji Kanamori, Yuko Tsuchiya, Sandor Vajda, Panagiotis L. Kastritis, Brian Jimenez, Thom Vreven, Xiufeng Yang, Hiromitsu Shimoyama, Nan Zhao, Zhiping Weng, Sheng-You Huang, Mikael Trellet, Chaok Seok, Samuel C. Flores, Miguel Romero-Durana, Sanbo Qin, Michael S. Pacella, Julie C. Mitchell, Mayuko Takeda-Shitaka, Dmitri Beglov, Jeffrey J. Gray, Shoshana J. Wodak, Rocco Moretti, Martin Zacharias, Dmitry Korkin, Dima Kozakov, João P. G. L. M. Rodrigues, Haruki Nakamura, Juan Esquivel-Rodríguez, Mieczyslaw Torchala, Yves Dehouck, Alexandre M. J. J. Bonvin, David R. Hall, Mitsuo Iwadate, Krishna Praneeth Kilambi, Jamica Sarmiento, Daron M. Standley, Joël Janin, Omar N. A. Demerdash, Brian G. Pierce, Chiara Pallara, Meng Cui, Shusuke Teraguchi, Petr Popov, Hasup Lee, Haotian Li, Juan Fernández-Recio, Laura Pérez-Cano, Sergei Grudinin, Sameer Velankar, Daisuke Kihara, Xiaofeng Ji, Genki Terashi, Yi Xiao, Shide Liang, and Iain H. Moal
- Subjects
Genetics ,0303 health sciences ,Mutation ,010304 chemical physics ,Fitness landscape ,Stability (learning theory) ,Computational biology ,Yeast display ,Biology ,medicine.disease_cause ,01 natural sciences ,Biochemistry ,Deep sequencing ,Protein–protein interaction ,03 medical and health sciences ,Structural Biology ,0103 physical sciences ,medicine ,CASP ,Saturated mutagenesis ,Molecular Biology ,030304 developmental biology - Abstract
Community-wide blind prediction experiments such as CAPRI and CASP provide an objective measure of the current state of predictive methodology. Here we describe a community-wide assessment of methods to predict the effects of mutations on protein-protein interactions. Twenty-two groups predicted the effects of comprehensive saturation mutagenesis for two designed influenza hemagglutinin binders and the results were compared with experimental yeast display enrichment data obtained using deep sequencing. The most successful methods explicitly considered the effects of mutation on monomer stability in addition to binding affinity, carried out explicit side-chain sampling and backbone relaxation, evaluated packing, electrostatic, and solvation effects, and correctly identified around a third of the beneficial mutations. Much room for improvement remains for even the best techniques, and large-scale fitness landscapes should continue to provide an excellent test bed for continued evaluation of both existing and new prediction methodologies.
- Published
- 2013
- Full Text
- View/download PDF