57 results on '"Hahnbeom Park"'
Search Results
2. Evaluating GPCR modeling and docking strategies in the era of deep learning-based protein structure prediction
- Author
Sumin Lee, Seeun Kim, Gyu Rie Lee, Sohee Kwon, Hyeonuk Woo, Chaok Seok, and Hahnbeom Park
- Subjects
Structural Biology; Genetics; Biophysics; Biochemistry; Computer Science Applications; Biotechnology
- Abstract
While deep learning (DL) has brought a revolution in the protein structure prediction field, still an important question remains how the revolution can be transferred to advances in structure-based drug discovery. Because the lessons from the recent GPCR dock challenge were inconclusive primarily due to the size of the dataset, in this work we further elaborated on 70 diverse GPCR complexes bound to either small molecules or peptides to investigate the best-practice modeling and docking strategies for GPCR drug discovery. From our quantitative analysis, it is shown that substantial improvements in docking and virtual screening have been possible by the advance in DL-based protein structure predictions with respect to the expected results from the combination of best pre-DL tools. The success rate of docking on DL-based model structures approaches that of cross-docking on experimental structures, showing over 30% improvement from the best pre-DL protocols. This amount of performance could be achieved only when two modeling points were considered properly: 1) correct functional-state modeling of receptors and 2) receptor-flexible docking. Best-practice modeling strategies and the model confidence estimation metric suggested in this work may serve as a guideline for future computer-aided GPCR drug discovery scenarios.
- Published
- 2023
3. Pseudo-Isolated α-Helix Platform for the Recognition of Deep and Narrow Targets
- Author
Dong-in Kim, So-hee Han, Hahnbeom Park, Sehwan Choi, Mandeep Kaur, Euimin Hwang, Seong-jae Han, Jung-yeon Ryu, Hae-Kap Cheong, Ravi Pratap Barnwal, and Yong-beom Lim
- Subjects
Protein Conformation, alpha-Helical; Protein Folding; Colloid and Surface Chemistry; Circular Dichroism; Humans; Proteins; Amino Acid Sequence; General Chemistry; Peptides; Biochemistry; Protein Structure, Secondary; Catalysis
- Abstract
Although interest in stabilized α-helical peptides as next-generation therapeutics for modulating biomolecular interfaces is increasing, peptides have limited functionality and stability due to their small size. In comparison, α-helical ligands based on proteins can make steric clash with targets due to their large size. Here, we report the design of a monomeric pseudo-isolated α-helix (mPIH) system in which proteins behave as if they are peptides. The designed proteins contain α-helix ligands that do not require any covalent chemical modification, do not have frayed ends, and importantly can make sterically favorable interactions similar to isolated peptides. An optimal mPIH showed a more than 100-fold increase in target selectivity, which might be related to the advantages in conformational selection due to the absence of frayed ends. The α-helical ligand in the mPIH displayed high thermal stability well above human body temperature and showed reversible and rapid folding/unfolding transitions. Thus, mPIH can become a promising protein-based platform for developing stabilized α-helix pharmaceuticals.
- Published
- 2022
4. Accurate protein structure prediction: what comes next?
- Author
Hahnbeom Park, Chaok Seok, Minkyung Baek, Jonghun Won, Gyu Rie Lee, and Martin Steinegger
- Subjects
Computer science; Protein structure prediction; Algorithm
- Published
- 2021
5. Protein oligomer modeling guided by predicted interchain contacts in CASP14
- Author
David Baker, Ivan Anishchenko, Ian R. Humphreys, Minkyung Baek, and Hahnbeom Park
- Subjects
Models, Molecular; Materials science; Protein Conformation; Ab initio; Computational Biology; Proteins; Biochemistry; Oligomer; Protein Subunits; chemistry.chemical_compound; Crystallography; Deep Learning; chemistry; Sequence Analysis, Protein; Structural Biology; Docking (molecular); Structure generation; Databases, Protein; Molecular Biology; Software; Protein Binding
- Abstract
For CASP14, we developed deep learning-based methods for predicting homo-oligomeric and hetero-oligomeric contacts and used them for oligomer modeling. To build structure models, we developed an oligomer structure generation method that utilizes predicted interchain contacts to guide iterative restrained minimization from random backbone structures. We supplemented this gradient-based fold-and-dock method with template-based and ab initio docking approaches using deep learning-based subunit predictions on 29 assembly targets. These methods produced oligomer models with summed Z-scores 5.5 units higher than the next best group, with the fold-and-dock method having the best relative performance. Over the eight targets for which this method was used, the best of the five submitted models had average oligomer TM-score of 0.71 (average oligomer TM-score of the next best group: 0.64), and explicit modeling of inter-subunit interactions improved modeling of six out of 40 individual domains (ΔGDT-TS > 2.0).
- Published
- 2021
6. Accurate prediction of protein structures and interactions using a three-track neural network
- Author
Jose Henrique Pereira, Ana C. Ebrecht, Lisa N. Kinch, R. Dustin Schaeffer, Ivan Anishchenko, Justas Dauparas, Udit Dalwadi, Gyu Rie Lee, Christoph Buhlheller, Diederik J. Opperman, David Baker, Tea Pavkov-Keller, Qian Cong, Caleb R. Glassman, Alberdina A. van Dijk, Jue Wang, Andria V. Rodrigues, Theo Sagmeister, Randy J. Read, Andy DeGiovanni, Hahnbeom Park, Paul D. Adams, Calvin K. Yip, Frank DiMaio, John E. Burke, Claudia Millán, K. Christopher Garcia, Carson Adams, Minkyung Baek, Nick V. Grishin, Sergey Ovchinnikov, and Manoj K. Rathinaswamy
- Subjects
Structure (mathematical logic); 0303 health sciences; Sequence; Network architecture; Multidisciplinary; Artificial neural network; business.industry; Computer science; Deep learning; computer.software_genre; Modeling and simulation; 03 medical and health sciences; Structural bioinformatics; 0302 clinical medicine; Data mining; Artificial intelligence; business; Distance transform; computer; 030217 neurology & neurosurgery; 030304 developmental biology
- Abstract
DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging x-ray crystallography and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.
- Published
- 2021
7. Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14
- Author
Hahnbeom Park, David Baker, David E. Kim, Naozumi Hiranuma, Ivan Anishchenko, Ian R. Humphreys, Justas Dauparas, Minkyung Baek, and Sanaa Mansoor
- Subjects
Similarity (geometry); Computer science; Orientation (computer vision); business.industry; Deep learning; Pipeline (computing); Computational Biology; Proteins; Protein structure prediction; Biochemistry; Protein Structure, Tertiary; Deep Learning; Sequence Analysis, Protein; Structural Biology; Benchmark (computing); Humans; Metagenome; Artificial intelligence; Language model; business; CASP; Molecular Biology; Algorithm; Software
- Abstract
The trRosetta structure prediction method employs deep learning to generate predicted residue-residue distance and orientation distributions from which 3D models are built. We sought to improve the method by incorporating as inputs (in addition to sequence information) both language model embeddings and template information weighted by sequence similarity to the target. We also developed a refinement pipeline that recombines models generated by template-free and template utilizing versions of trRosetta guided by the DeepAccNet accuracy predictor. Both benchmark tests and CASP results show that the new pipeline is a considerable improvement over the original trRosetta, and it is faster and requires less computing resources, completing the entire modeling process in a median < 3 h in CASP14. Our human group improved results with this pipeline primarily by identifying additional homologous sequences for input into the network. We also used the DeepAccNet accuracy predictor to guide Rosetta high-resolution refinement for submissions in the regular and refinement categories; although performance was quite good on a CASP relative scale, the overall improvements were rather modest in part due to missing inter-domain or inter-chain contacts.
- Published
- 2021
8. Divergent acyl carrier protein decouples mitochondrial Fe-S cluster biogenesis from fatty acid synthesis in malaria parasites
- Author
Paul A. Sigala, Jaime Sepulveda, Seyi Falekun, James A. Wohlschlegel, Hahnbeom Park, and Yasaman Jami-Alahmadi
- Subjects
Fe-S cluster synthesis; Protozoan Proteins; acyl carrier protein; Mitochondrion; chemistry.chemical_compound; falciparum; 2.1 Biological and endogenous factors; 2.2 Factors relating to the physical environment; Biology (General); Aetiology; Microbiology and Infectious Disease; Organelle Biogenesis; biology; General Neuroscience; Fatty Acids; General Medicine; Cell biology; mitochondria; Acyl carrier protein; Infectious Diseases; Medicine; Infection; Research Article; QH301-705.5; Science; Iron; infectious disease; Plasmodium falciparum; malaria; chemical biology; P. falciparum; General Biochemistry, Genetics and Molecular Biology; Rare Diseases; Biosynthesis; Biochemistry and Chemical Biology; parasitic diseases; biochemistry; Fatty acid synthesis; General Immunology and Microbiology; microbiology; biology.organism_classification; Vector-Borne Diseases; Good Health and Well Being; chemistry; Coenzyme Q – cytochrome c reductase; biology.protein; organelle adaptation; Biochemistry and Cell Biology; Function (biology); Biogenesis; Sulfur
- Abstract
Plasmodium falciparum malaria parasites are early-diverging eukaryotes with many unusual metabolic adaptations. Understanding these adaptations will give insight into parasite evolution and unveil new parasite-specific drug targets. Most eukaryotic cells retain a mitochondrial fatty acid synthesis (FASII) pathway whose acyl carrier protein (mACP) and 4-phosphopantetheine (Ppant) prosthetic group provide a soluble scaffold for acyl chain synthesis. In yeast and humans, mACP also functions to biochemically couple FASII activity to electron transport chain (ETC) assembly and Fe-S cluster biogenesis. In contrast to most eukaryotes, the Plasmodium mitochondrion lacks FASII enzymes yet curiously retains a divergent mACP lacking a Ppant group. We report that ligand-dependent knockdown of mACP is lethal to parasites, indicating an essential FASII-independent function. Decyl-ubiquinone rescues parasites temporarily from death, suggesting a dominant dysfunction of the mitochondrial ETC followed by broader cellular defects. Biochemical studies reveal that Plasmodium mACP binds and stabilizes the Isd11-Nfs1 complex required for Fe-S cluster biosynthesis, despite lacking the Ppant group required for this association in other eukaryotes, and knockdown of parasite mACP causes loss of both Nfs1 and the Rieske Fe-S protein in ETC Complex III. This work reveals that Plasmodium parasites have evolved to decouple mitochondrial Fe-S cluster biogenesis from FASII activity, and this adaptation is a shared metabolic feature of other Apicomplexan pathogens, including Toxoplasma and Babesia. This discovery also highlights the ancient, fundamental role of ACP in mitochondrial Fe-S cluster biogenesis and unveils an evolutionary driving force to retain this interaction with ACP independent of its eponymous function in FASII.
Significance Statement: Plasmodium malaria parasites are single-celled eukaryotes that evolved unusual metabolic adaptations. Parasites require a mitochondrion for blood-stage viability, but essential functions beyond the electron transport chain are sparsely understood. Unlike yeast and human cells, the Plasmodium mitochondrion lacks fatty acid synthesis enzymes but retains a divergent acyl carrier protein (mACP) incapable of tethering acyl groups. Nevertheless, mACP is essential for parasite viability by binding and stabilizing the core mitochondrial Fe-S cluster biogenesis complex via a divergent molecular interface lacking an acyl-pantetheine group that contrasts with other eukaryotes. This discovery unveils an essential metabolic adaptation in Plasmodium and other human parasites that decouples mitochondrial Fe-S cluster biogenesis from fatty acid synthesis and evolved at or near the emergence of Apicomplexan parasitism.
- Published
- 2021
9. Author response: Divergent acyl carrier protein decouples mitochondrial Fe-S cluster biogenesis from fatty acid synthesis in malaria parasites
- Author
Paul A. Sigala, Yasaman Jami-Alahmadi, Hahnbeom Park, Jaime Sepulveda, James A. Wohlschlegel, and Seyi Falekun
- Subjects
Acyl carrier protein; chemistry.chemical_compound; biology; Biochemistry; Chemistry; biology.protein; medicine; Cluster (physics); medicine.disease; Malaria; Biogenesis; Fatty acid synthesis
- Published
- 2021
10. High‐accuracy refinement using Rosetta in CASP13
- Author
Gyu Rie Lee, Qian Cong, Hahnbeom Park, Ivan Anishchenko, David E. Kim, and David Baker
- Subjects
Models, Molecular; Protein Folding; Fold (higher-order function); Protein Conformation; Computer science; media_common.quotation_subject; Biochemistry; Article; 03 medical and health sciences; Structural Biology; Energy level; Search problem; Quality (business); Molecular Biology; 030304 developmental biology; media_common; Structure (mathematical logic); 0303 health sciences; 030302 biochemistry & molecular biology; Computational Biology; Proteins; Reproducibility of Results; Function (mathematics); Protein structure prediction; Thermodynamics; Algorithm; Algorithms; Energy (signal processing)
- Abstract
Because proteins generally fold to their lowest free energy states, energy-guided refinement in principle should be able to systematically improve the quality of protein structure models generated using homologous structure or co-evolution derived information. However, because of the high dimensionality of the search space, there are far more ways to degrade the quality of a near native model than to improve it, and hence, refinement methods are very sensitive to energy function errors. In the 13th Critical Assessment of techniques for protein Structure Prediction (CASP13), we sought to carry out a thorough search for low energy states in the neighborhood of a starting model using restraints to avoid straying too far. The approach was reasonably successful in improving both regions largely incorrect in the starting models as well as core regions that started out closer to the correct structure. Models with GDT-HA over 70 were obtained for five targets and for one of those, an accuracy of 0.5 Å backbone root-mean-square deviation (RMSD) was achieved. An important current challenge is to improve performance in refining oligomers and larger proteins, for which the search problem remains extremely difficult.
- Published
- 2019
11. Author response for 'Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14'
- Author
Justas Dauparas, David Baker, David E. Kim, Minkyung Baek, Ian R. Humphreys, Naozumi Hiranuma, Ivan Anishchenko, Hahnbeom Park, and Sanaa Mansoor
- Subjects
business.industry; Computer science; Deep learning; Artificial intelligence; business; computer.software_genre; computer; Protein tertiary structure; Natural language processing
- Published
- 2021
12. Author response for 'Protein oligomer modeling guided by predicted interchain contacts in CASP14'
- Author
David Baker, Hahnbeom Park, Ivan Anishchenko, Minkyung Baek, and Ian R. Humphreys
- Subjects
Crystallography; chemistry.chemical_compound; Chemistry; Oligomer
- Published
- 2021
13. Accurate prediction of protein structures and interactions using a 3-track network
- Author
Nick V. Grishin, Minkyung Baek, Udit Dalwadi, Gyu Rie Lee, Hahnbeom Park, Carson Adams, Alberdina A. van Dijk, Manoj K. Rathinaswamy, Theo Sagmeister, Qian Cong, Frank DiMaio, Randy J. Read, David Baker, Paul D. Adams, Sergey Ovchinnikov, Christoph Buhlheller, Calvin K. Yip, Caleb R. Glassman, Ivan Anishchenko, R. Dustin Schaeffer, Claudia Millán, Diederik J. Opperman, Tea Pavkov-Keller, Jose Henrique Pereira, Ana C. Ebrecht, Lisa N. Kinch, Jue Wang, John E. Burke, Kenan Christopher Garcia, Andria V. Rodrigues, Justas Dauparas, and Andy DeGiovanni
- Subjects
Structure (mathematical logic); Network architecture; Sequence; Protein structure; Computer science; Data mining; Track (rail transport); Protein structure modeling; computer.software_genre; computer; Distance transform
- Abstract
DeepMind presented remarkably accurate protein structure predictions at the CASP14 conference. We explored network architectures incorporating related ideas and obtained the best performance with a 3-track network in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The 3-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables rapid solution of challenging X-ray crystallography and cryo-EM structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate models of protein-protein complexes from sequence information alone, short circuiting traditional approaches which require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.
One-Sentence Summary: Accurate protein structure modeling enables rapid solution of structure determination problems and provides insights into biological function.
- Published
- 2021
14. Force field optimization guided by small molecule crystal lattice data enables consistent sub-Angstrom protein-ligand docking
- Author
Hahnbeom Park, Frank DiMaio, David Baker, Guangfeng Zhou, and Minkyung Baek
- Subjects
Physics; 010304 chemical physics; Drug discovery; Force field (physics); Proteins; Crystal structure; Crystallography, X-Ray; Ligands; 01 natural sciences; Small molecule; Chemical space; Article; Computer Science Applications; Molecular Docking Simulation; Small Molecule Libraries; Protein–ligand docking; Chemical physics; 0103 physical sciences; Molecule; Angstrom; Physics::Chemical Physics; Physical and Theoretical Chemistry; Algorithms
- Abstract
Accurate and rapid calculation of protein-small molecule interaction free energies is critical for computational drug discovery. Because of the large chemical space spanned by drug-like molecules, classical force fields contain thousands of parameters describing atom-pair distance and torsional preferences; each parameter is typically optimized independently on simple representative molecules. Here we describe a new approach in which small molecule force field parameters are jointly optimized guided by the rich source of information contained within thousands of available small molecule crystal structures. We optimize parameters by requiring that the experimentally determined molecular lattice arrangements have lower energy than all alternative lattice arrangements. Thousands of independent crystal lattice-prediction simulations were run on each of 1,386 small molecule crystal structures, and energy function parameters of an implicit solvent energy model were optimized so native crystal lattice arrangements had the lowest energy. The resulting energy model was implemented in Rosetta, together with a rapid genetic algorithm docking method employing grid-based scoring and receptor flexibility. The success rate of bound structure recapitulation in cross-docking on 1,112 complexes was improved by more than 10% over previously published methods, with solutions within
- Published
- 2021
15. Correction to 'The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design'
- Author
Rebecca F. Alford, Andrew Leaver-Fay, Jeliazko R. Jeliazkov, Matthew J. O’Meara, Frank P. DiMaio, Hahnbeom Park, Maxim V. Shapovalov, P. Douglas Renfrew, Vikram K. Mulligan, Kalli Kappel, Jason W. Labonte, Michael S. Pacella, Richard Bonneau, Philip Bradley, Roland L. Dunbrack, Rhiju Das, David Baker, Brian Kuhlman, Tanja Kortemme, and Jeffrey J. Gray
- Subjects
Physical and Theoretical Chemistry; Computer Science Applications
- Published
- 2022
16. Learning a force field from small-molecule crystal lattice predictions enables consistent sub-Angstrom protein-ligand docking
- Author
Frank DiMaio, Hahnbeom Park, David Baker, Guangfeng Zhou, and Minkyung Baek
- Subjects
Physics; Protein–ligand docking; Docking (molecular); Lattice (order); Molecule; Crystal structure; Molecular physics; Small molecule; Chemical space; Force field (chemistry)
- Abstract
Accurate and rapid calculation of protein-small molecule interaction energies is critical for computational drug discovery. Because of the large chemical space spanned by drug-like molecules, classical force fields contain thousands of parameters describing atom-pair distance and torsional preferences; each parameter is typically optimized independently on simple representative molecules. Here we describe a new approach in which small-molecule force field parameters are jointly optimized guided by the rich source of information contained within thousands of available small molecule crystal structures. We optimize parameters by requiring that the experimentally determined molecular lattice arrangements have lower energy than all alternative lattice arrangements. Thousands of independent crystal lattice-prediction simulations were run on each of 1,386 small molecule crystal structures, and energy function parameters of an implicit solvent energy model were optimized so native crystal lattice arrangements had lowest energy. The resulting energy model was implemented in Rosetta, together with a rapid genetic algorithm docking method employing grid based scoring and receptor flexibility. The success rate of bound structure recapitulation in cross-docking on 1,112 complexes was improved by more than 10% over previously published methods, with solutions within
- Published
- 2020
17. Prediction of Protein Mutational Free Energy: Benchmark and Sampling Improvements Increase Classification Accuracy
- Author
Frank DiMaio, Steven M. Lewis, Yifan Song, Hahnbeom Park, Brandon Frenz, and Indigo Chris King
- Subjects
0301 basic medicine; Histology; protein design and engineering; Computer science; lcsh:Biotechnology; Biomedical Engineering; Bioengineering; 02 engineering and technology; computer.software_genre; 03 medical and health sciences; thermodynamics; Software; Protein stability; lcsh:TP248.13-248.65; Methods; Alanine; Software suite; business.industry; Point mutation; Experimental data; Bioengineering and Biotechnology; mutation free energy; 021001 nanoscience & nanotechnology; 030104 developmental biology; Data mining; mutation; 0210 nano-technology; business; protein; computer; Biotechnology
- Abstract
Software to predict the change in protein stability upon point mutation is a valuable tool for a number of biotechnological and scientific problems. To facilitate the development of such software and provide easy access to the available experimental data, the ProTherm database was created. Biases in the methods and types of information collected has led to disparity in the types of mutations for which experimental data is available. For example, mutations to alanine are hugely overrepresented whereas those involving charged residues, especially from one charged residue to another, are underrepresented. ProTherm subsets created as benchmark sets that do not account for this often underrepresented certain mutational types. This issue introduces systematic biases into previously published protocols’ ability to accurately predict the change in folding energy on these classes of mutations. To resolve this issue, we have generated a new benchmark set with these problems corrected. We have then used the benchmark set to test a number of improvements to the point mutation energetics tools in the Rosetta software suite.
- Published
- 2020
18. Sampling and energy evaluation challenges in ligand binding protein design
- Author
Lindsey Doyle, Barry L. Stoddard, David Baker, Per Jr Greisen, Jiayi Dou, Alberto Schena, Hahnbeom Park, and Kai Johnsson
- Subjects
0301 basic medicine; Ligand efficiency; 010304 chemical physics; Hydrogen bond; Ligand; Chemistry; Stereochemistry; Protein design; Ligand Binding Protein; 01 natural sciences; Biochemistry; 03 medical and health sciences; 030104 developmental biology; Protein structure; 0103 physical sciences; Molecule; Binding site; Molecular Biology
- Abstract
The steroid hormone 17α-hydroxylprogesterone (17-OHP) is a biomarker for congenital adrenal hyperplasia and hence there is considerable interest in development of sensors for this compound. We used computational protein design to generate protein models with binding sites for 17-OHP containing an extended, nonpolar, shape-complementary binding pocket for the four-ring core of the compound, and hydrogen bonding residues at the base of the pocket to interact with carbonyl and hydroxyl groups at the more polar end of the ligand. Eight of 16 designed proteins experimentally tested bind 17-OHP with micromolar affinity. A co-crystal structure of one of the designs revealed that 17-OHP is rotated 180° around a pseudo-two-fold axis in the compound and displays multiple binding modes within the pocket, while still interacting with all of the designed residues in the engineered site. Subsequent rounds of mutagenesis and binding selection improved the ligand affinity to nanomolar range, while appearing to constrain the ligand to a single bound conformation that maintains the same "flipped" orientation relative to the original design. We trace the discrepancy in the design calculations to two sources: first, a failure to model subtle backbone changes which alter the distribution of sidechain rotameric states and second, an underestimation of the energetic cost of desolvating the carbonyl and hydroxyl groups of the ligand. The difference between design model and crystal structure thus arises from both sampling limitations and energy function inaccuracies that are exacerbated by the near two-fold symmetry of the molecule.
- Published
- 2017
19. Automatic structure prediction of oligomeric assemblies using Robetta in CASP12
- Author
Frank DiMaio, David E. Kim, David Baker, Sergey Ovchinnikov, and Hahnbeom Park
- Subjects
Models, Molecular; 0301 basic medicine; Structure (mathematical logic); Protein Conformation; Computer science; Pipeline (computing); Computational Biology; Proteins; computer.software_genre; Biochemistry; Article; Set (abstract data type); 03 medical and health sciences; Crystallography; 030104 developmental biology; Biological Problem; Sequence Analysis, Protein; Structural Biology; Humans; Data mining; Protein Multimerization; Databases, Protein; Molecular Biology; computer; Software
- Abstract
Many naturally occurring protein systems function primarily as symmetric assemblies. Prediction of the quaternary structure of these assemblies is an important biological problem. This manuscript describes automated tools we have developed for predicting the structure of symmetric protein assemblies in the Robetta structure prediction server. We assess the performance of this pipeline on a set of targets from the recent CASP12/CAPRI blind quaternary structure prediction experiment. Our approach successfully predicted five of seven symmetric assemblies in this challenge, and was assessed as the best participating server group, and one of only two groups (human or server) with two predictions judged as high quality by the assessors. We also assess the method on a broader set of 22 natively symmetric CASP12 targets, where we show that oligomeric modeling can improve the accuracy of monomeric structure determination, particularly in highly intertwined oligomers.
- Published
- 2017
20. Improved protein structure prediction using predicted interresidue orientations
- Author
Zhenling Peng, Ivan Anishchenko, David Baker, Hahnbeom Park, Jianyi Yang, and Sergey Ovchinnikov
- Subjects
0301 basic medicine; Computer science; Protein Conformation; Residual; Modeling and simulation; 03 medical and health sciences; Structural bioinformatics; 0302 clinical medicine; Deep Learning; Sequence Analysis, Protein; Range (statistics); Animals; Humans; Multidisciplinary; business.industry; Deep learning; Protein structure prediction; Biological Sciences; 030104 developmental biology; Benchmark (computing); Critical assessment; Artificial intelligence; business; Algorithm; 030217 neurology & neurosurgery; Software
- Abstract
The prediction of interresidue contacts and distances from coevolutionary data using deep learning has considerably advanced protein structure prediction. Here, we build on these advances by developing a deep residual network for predicting interresidue orientations, in addition to distances, and a Rosetta-constrained energy-minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on 13th Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13)- and Continuous Automated Model Evaluation (CAMEO)-derived sets, the method outperforms all previously described structure-prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo-designed proteins, identifying the key fold-determining residues and providing an independent quantitative measure of the “ideality” of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.
- Published
- 2020
21. Improved protein structure prediction using predicted inter-residue orientations
- Author
Ivan Anishchenko, Hahnbeom Park, Jianyi Yang, David Baker, Zhenling Peng, and Sergey Ovchinnikov
- Subjects
Quantitative measure; 0303 health sciences; 03 medical and health sciences; Computer science; 030302 biochemistry & molecular biology; A protein; Protein structure prediction; Energy minimization; Algorithm; 030304 developmental biology
- Abstract
The prediction of inter-residue contacts and distances from co-evolutionary data using deep learning has considerably advanced protein structure prediction. Here we build on these advances by developing a deep residual network for predicting inter-residue orientations in addition to distances, and a Rosetta constrained energy minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on CASP13 and CAMEO derived sets, the method outperforms all previously described structure prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo designed proteins, identifying the key fold determining residues and providing an independent quantitative measure of the “ideality” of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.
- Published
- 2019
22. Macromolecular modeling and design in Rosetta: recent methods and frameworks
- Author
-
Jack Maguire, Ragul Gowthaman, Marion F. Sauer, Georg Kuenze, Tanja Kortemme, Benjamin Basanta, Indigo Chris King, Jens Meiler, Rhiju Das, Ora Schueler-Furman, Nicholas A. Marze, Brandon Frenz, Christoffer Norn, Julia Koehler Leman, Jason W. Labonte, Kala Bharath Pilla, Lei Shi, Sergey Lyskov, Brian D. Weitzner, Nir London, Karen R. Khar, Jaume Bonet, Nawsad Alam, Andreas Scheck, Alexander M. Sevy, Lars Malmström, Thomas Huber, Christopher Bystroff, Lior Zimmerman, Lorna Dsilva, Bruno E. Correia, Roland L. Dunbrack, Sergey Ovchinnikov, Rocco Moretti, Scott Horowitz, Phil Bradley, Frank DiMaio, Noah Ollikainen, Brian Kuhlman, Jeffrey J. Gray, Melanie L. Aprahamian, Andrew Leaver-Fay, Santrupti Nerli, Brian Koepnick, Xingjie Pan, Manasi A. Pethe, Andrew M. Watkins, Summer B. Thyme, Enrique Marcos, Vikram Khipple Mulligan, Hahnbeom Park, Po-Ssu Huang, David K. Johnson, Daniel-Adriano Silva, Patrick Barth, Shannon Smith, Caleb Geniesse, Jason K. Lai, Patrick Conway, Amelie Stein, Jeliazko R. Jeliazkov, David Baker, Dominik Gront, Kalli Kappel, Firas Khatib, Robert Kleffner, Brian J. Bender, Richard Bonneau, Kyle A. Barlow, Joseph H. Lubin, Shourya S. Roy Burman, Nikolaos G. Sgourakis, Yuval Sedan, Ryan E. Pavlovicz, Kristin Blacklock, Seth Cooper, Barak Raveh, Alisa Khramushin, John Karanicolas, Justin B. Siegel, Sharon L. Guffy, Brian G. Pierce, Alex Ford, Darwin Y. Fu, Orly Marcu, Gideon Lapidoth, Brian Coventry, René M. de Jong, Shane O’Conchúir, Thomas W. Linsky, William R. Schief, Rebecca F. Alford, Scott E. Boyken, Sagar D. Khare, Maria Szegedy, Ray Yu-Ruei Wang, Steven M. Lewis, Hamed Khakzad, Timothy M. Jacobs, Frank D. Teets, Lukasz Goldschmidt, Daisuke Kuroda, Steffen Lindert, P. Douglas Renfrew, Yifan Song, Jared Adolf-Bryfogle, Michael S. Pacella, and Aliza B. Rubenstein
- Subjects
atomic-accuracy ,Models, Molecular ,Computer science ,Macromolecular Substances ,Protein Conformation ,Interoperability ,computational design ,Score ,antibody structures ,Biochemistry ,Article ,homing endonuclease specificity ,03 medical and health sciences ,Software ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,business.industry ,Proteins ,Usability ,fold determination ,Cell Biology ,Molecular Docking Simulation ,variable region ,Docking (molecular) ,protein-structure prediction ,small-molecule docking ,Modeling and design ,Peptidomimetics ,User interface ,Software engineering ,business ,de-novo design ,sparse nmr data ,Biotechnology - Abstract
The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at . This Perspective reviews tools developed over the past five years in the macromolecular modeling, docking and design software Rosetta.
- Published
- 2019
23. Efficient consideration of coordinated water molecules improves computational protein-protein and protein-ligand docking
- Author
-
Frank DiMaio, Hahnbeom Park, and Ryan E. Pavlovicz
- Subjects
chemistry.chemical_classification ,chemistry ,Protein–ligand docking ,Hydrogen bond ,Docking (molecular) ,Biomolecule ,Protein design ,Water model ,Molecule ,Biological system ,Small molecule - Abstract
Highly coordinated water molecules are frequently an integral part of protein-protein and protein-ligand interfaces. We introduce an updated energy model that efficiently captures the energetic effects of these highly coordinated water molecules on the surfaces of proteins. A two-stage protocol is developed in which polar groups arranged in geometries suitable for water placement are first identified, then a modified Monte Carlo simulation allows highly coordinated waters to emerge. This “semi-explicit” water model is implemented in Rosetta and allows for simultaneous prediction of side chain conformation and coordinated water geometry; the approach is suitable for structure prediction and protein design. We show that our new approach and energy model yield significant improvements in native structure recovery of protein-protein and protein-ligand docking. Significance Statement: Coordinated water molecules, those forming multiple hydrogen bonds with protein polar groups, play an important role in the structure of and interaction between biomolecules, yet the effect of these waters is often not considered in biomolecular computations. In this paper, we describe a method to efficiently consider these water molecules both implicitly and explicitly at the interfaces formed by two polar molecules. In computations related to determining how a protein interacts with binding partners, we show that the use of this new method significantly improves results. Future application of this approach may improve the design of new protein and small molecule drugs.
- Published
- 2019
24. Protein structure determination using metagenome sequence data
- Author
-
Neha Varghese, Hahnbeom Park, Po-Ssu Huang, Hetunandan Kamisetty, David Baker, Nikos C. Kyrpides, David E. Kim, Georgios A. Pavlopoulos, and Sergey Ovchinnikov
- Subjects
Models, Molecular ,0301 basic medicine ,Protein Folding ,Protein family ,Protein Conformation ,Computational biology ,Biology ,Crystallography, X-Ray ,Bioinformatics ,Evolution, Molecular ,03 medical and health sciences ,Protein structure ,Data sequences ,Sequence Analysis, Protein ,Amino Acid Sequence ,Databases, Protein ,Structure matching ,Multidisciplinary ,Computational Biology ,Proteins ,computer.file_format ,Protein Data Bank ,030104 developmental biology ,Membrane protein ,Metagenomics ,Metagenome ,computer ,Algorithms ,Software ,Protein Structure Initiative - Abstract
Filling in the protein fold picture: Fewer than a third of the 14,849 known protein families have at least one member with an experimentally determined structure. This leaves more than 5000 protein families with no structural information. Protein modeling using residue-residue contacts inferred from evolutionary data has been successful in modeling unknown structures, but it requires large numbers of aligned sequences. Ovchinnikov et al. augmented such sequence alignments with metagenome sequence data (see the Perspective by Söding). They determined the number of sequences required to allow modeling, developed criteria for model quality, and, where possible, improved modeling by matching predicted contacts to known structures. Their method predicted quality structural models for 614 protein families, of which about 140 represent newly discovered protein folds. Science, this issue p. 294; see also p. 248
- Published
- 2017
25. Structure prediction using sparse simulated NOE restraints with Rosetta in CASP11
- Author
-
David E. Kim, Ray Yu-Ruei Wang, David Baker, Hahnbeom Park, Yuan Liu, and Sergey Ovchinnikov
- Subjects
0301 basic medicine ,Amino Acid Motifs ,030102 biochemistry & molecular biology ,Chemistry ,Low resolution ,Nuclear Overhauser effect ,Protein structure prediction ,Biochemistry ,03 medical and health sciences ,030104 developmental biology ,Protein structure ,Nuclear magnetic resonance ,Structural Biology ,Pairing ,Protein folding ,Molecular Biology ,Algorithm - Abstract
In CASP11 we generated protein structure models using simulated ambiguous and unambiguous nuclear Overhauser effect (NOE) restraints with a two-stage protocol. Low-resolution models were generated guided by the unambiguous restraints using continuous chain folding for alpha and alpha-beta proteins, and iterative annealing for all-beta proteins to take advantage of the strand pairing information implicit in the restraints. The Rosetta fragment/model hybridization protocol was then used to recombine and regularize these models, and refine them in the Rosetta full-atom energy function guided by both the unambiguous and the ambiguous restraints. Fifteen out of 19 targets were modeled with GDT-TS quality scores greater than 60 for Model 1, significantly improving upon the non-assisted predictions. Our results suggest that atomic-level accuracy is achievable using sparse NOE data when there is at least one correctly assigned NOE for every residue. Proteins 2016; 84(Suppl 1):181-188. © 2016 Wiley Periodicals, Inc.
- Published
- 2016
26. Efficient consideration of coordinated water molecules improves computational protein-protein and protein-ligand docking discrimination
- Author
-
Frank DiMaio, Ryan E. Pavlovicz, and Hahnbeom Park
- Subjects
0301 basic medicine ,Statistical methods ,Monte Carlo method ,Protein Structure Prediction ,Ligands ,Physical Chemistry ,Biochemistry ,Molecular Docking Simulation ,0302 clinical medicine ,Protein structure ,Macromolecular Structure Analysis ,Biochemical Simulations ,Amino Acids ,Biology (General) ,Crystallography ,Ecology ,Organic Compounds ,Chemistry ,Physics ,Statistics ,Chemical Reactions ,Protein structure prediction ,Condensed Matter Physics ,Computational Theory and Mathematics ,Modeling and Simulation ,Physical Sciences ,Crystal Structure ,Engineering and Technology ,Biological system ,Algorithms ,Research Article ,Biotechnology ,Protein Binding ,Protein Structure ,QH301-705.5 ,Protein design ,Solvation ,Bioengineering ,03 medical and health sciences ,Cellular and Molecular Neuroscience ,Genetics ,Water model ,Solid State Physics ,Molecular Biology ,Ecology, Evolution, Behavior and Systematics ,Binding Sites ,Chemical Bonding ,Organic Chemistry ,Chemical Compounds ,Biology and Life Sciences ,Proteins ,Computational Biology ,Water ,Hydrogen Bonding ,Research and analysis methods ,030104 developmental biology ,Protein–ligand docking ,Small Molecules ,Docking (molecular) ,Mathematical and statistical techniques ,Mathematics ,030217 neurology & neurosurgery - Abstract
Highly coordinated water molecules are frequently an integral part of protein-protein and protein-ligand interfaces. We introduce an updated energy model that efficiently captures the energetic effects of these ordered water molecules on the surfaces of proteins. A two-stage method is developed in which polar groups arranged in geometries suitable for water placement are first identified, then a modified Monte Carlo simulation allows highly coordinated waters to be placed on the surface of a protein while simultaneously sampling amino acid side chain orientations. This “semi-explicit” water model is implemented in Rosetta and is suitable for both structure prediction and protein design. We show that our new approach and energy model yield significant improvements in native structure recovery of protein-protein and protein-ligand docking discrimination tests. Author summary: Well-coordinated water molecules (those forming multiple hydrogen bonds with nearby polar groups) play an important role in the structure of biomolecular systems, yet the effect of these waters is often not considered in molecular energy computations. In this paper, we describe a method to efficiently consider these water molecules both implicitly and explicitly at the interfaces formed by two polar molecules. In computations related to determining how a protein interacts with binding partners, we show that the use of this new method significantly improves results. Future application of this approach may improve the design of new protein and small molecule drugs.
- Published
- 2020
27. Conditioning by adaptive sampling for robust design
- Author
-
Brookes, D. H., Hahnbeom Park, and Listgarten, J.
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Statistics - Machine Learning ,Machine Learning (stat.ML) ,Machine Learning (cs.LG) - Abstract
We present a new method for design problems wherein the goal is to maximize or specify the value of one or more properties of interest. For example, in protein design, one may wish to find the protein sequence that maximizes fluorescence. We assume access to one or more, potentially black box, stochastic "oracle" predictive functions, each of which maps from the input design space (e.g., protein sequences) to a distribution over a property of interest (e.g., protein fluorescence). At first glance, this problem can be framed as one of optimizing the oracle(s) with respect to the input. However, many state-of-the-art predictive models, such as neural networks, are known to suffer from pathologies, especially for data far from the training distribution. Thus we need to modulate the optimization of the oracle inputs with prior knowledge about what makes "realistic" inputs (e.g., proteins that stably fold). Herein, we propose a new method to solve this problem, Conditioning by Adaptive Sampling, which yields state-of-the-art results on a protein fluorescence problem, as compared to other recently published approaches. Formally, our method achieves its success by using model-based adaptive sampling to estimate the conditional distribution of the input sequences given the desired properties.
- Published
- 2019
28. GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G-Protein-Coupled Receptors
- Author
-
Chaok Seok, Jonghun Won, Gyu Rie Lee, and Hahnbeom Park
- Subjects
0301 basic medicine ,Models, Molecular ,Loop (graph theory) ,Similarity (geometry) ,Protein Conformation ,General Chemical Engineering ,Ab initio ,Library and Information Sciences ,Protein Structure, Secondary ,Receptors, G-Protein-Coupled ,03 medical and health sciences ,Protein structure ,Animals ,Humans ,Disulfides ,Databases, Protein ,Physics ,030102 biochemistry & molecular biology ,Drug discovery ,Sampling (statistics) ,General Chemistry ,Computer Science Applications ,030104 developmental biology ,Template ,Metric (mathematics) ,Biological system - Abstract
The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In this study, a new ECL2 conformational sampling method involving both template-based and ab initio sampling was developed. Inspired by the observation of similar ECL2 structures of closely related GPCRs, a template-based sampling method employing loop structure templates selected from the structure database was developed. A new metric for evaluating similarity of the target loop to templates was introduced for template selection. An ab initio loop sampling method was also developed to treat cases without highly similar templates. The ab initio method is based on the previously developed fragment assembly and loop closure method. A new sampling component that takes advantage of secondary structure prediction was added. In addition, a conserved disulfide bridge restraining ECL2 conformation was predicted and analytically incorporated into sampling, reducing the effective dimension of the conformational search space. The sampling method was combined with an existing energy function for comparison with previously reported loop structure prediction methods, and the benchmark test demonstrated outstanding performance.
- Published
- 2018
29. De novo design of a fluorescence-activating β-barrel
- Author
-
Lauren Carter, Binchen Mao, Joshua C. Vaughan, Enrique Marcos, Matthew J. Bick, Hahnbeom Park, Po-Ssu Huang, Banumathi Sankaran, Min Yen Lee, William Sheffler, Glenna Wink Foight, Barry L. Stoddard, Sergey Ovchinnikov, David Baker, Anastassia A. Vorobieva, Lindsey Doyle, Lauren A. Gagnon, and Jiayi Dou
- Subjects
0301 basic medicine ,Protein Structure ,Secondary ,Protein Folding ,Globular protein ,General Science & Technology ,Protein domain ,Green Fluorescent Proteins ,Plasma protein binding ,Ligands ,Article ,Fluorescence ,Protein Structure, Secondary ,Cercopithecus aethiops ,03 medical and health sciences ,0302 clinical medicine ,Protein structure ,Protein Domains ,Yeasts ,MD Multidisciplinary ,Benzyl Compounds ,Chlorocebus aethiops ,Escherichia coli ,Animals ,Imidazolines ,chemistry.chemical_classification ,Multidisciplinary ,Ligand ,Protein Stability ,Proteins ,Reproducibility of Results ,Hydrogen Bonding ,Amino acid ,Barrel ,030104 developmental biology ,chemistry ,general ,COS Cells ,Biophysics ,Protein folding ,Generic health relevance ,030217 neurology & neurosurgery ,Protein Binding - Abstract
The regular arrangements of β-strands around a central axis in β-barrels and of α-helices in coiled coils contrast with the irregular tertiary structures of most globular proteins, and have fascinated structural biologists since they were first discovered. Simple parametric models have been used to design a wide range of α-helical coiled-coil structures, but to date there has been no success with β-barrels. Here we show that accurate de novo design of β-barrels requires considerable symmetry-breaking to achieve continuous hydrogen-bond connectivity and eliminate backbone strain. We then build ensembles of β-barrel backbone models with cavity shapes that match the fluorogenic compound DFHBI, and use a hierarchical grid-based search method to simultaneously optimize the rigid-body placement of DFHBI in these cavities and the identities of the surrounding amino acids to achieve high shape and chemical complementarity. The designs have high structural accuracy and bind and fluorescently activate DFHBI in vitro and in Escherichia coli, yeast and mammalian cells. This de novo design of small-molecule binding activity, using backbones custom-built to bind the ligand, should enable the design of increasingly sophisticated ligand-binding proteins, sensors and catalysts that are not limited by the backbone geometries available in known protein structures.
- Published
- 2018
30. High-resolution protein–protein docking by global optimization: recent advances and future challenges
- Author
-
Hahnbeom Park, Chaok Seok, and Hasup Lee
- Subjects
Lead Finder ,Computer science ,Protein protein ,Proteins ,Water ,Nanotechnology ,Computational biology ,Molecular Docking Simulation ,Protein–protein interaction ,Protein–ligand docking ,Structural Biology ,Docking (molecular) ,Humans ,Protein–protein interaction prediction ,Molecular Biology ,Global optimization - Abstract
A computational protein-protein docking method that predicts atomic details of protein-protein interactions from protein monomer structures is an invaluable tool for understanding the molecular mechanisms of protein interactions and for designing molecules that control such interactions. Compared to low-resolution docking, high-resolution docking explores the conformational space in atomic resolution to provide predictions with atomic details. This allows for applications to more challenging docking problems that involve conformational changes induced by binding. Recently, high-resolution methods have become more promising as additional information such as global shapes or residue contacts is now available from experiments or sequence/structure data. In this review article, we highlight developments in high-resolution docking made during the last decade, specifically regarding global optimization methods employed by the docking methods. We also discuss two major challenges in high-resolution docking: prediction of backbone flexibility and water-mediated interactions.
- Published
- 2015
31. CASP11 refinement experiments with ROSETTA
- Author
-
Frank DiMaio, David Baker, and Hahnbeom Park
- Subjects
0301 basic medicine ,Protocol (science) ,Structure (mathematical logic) ,Computer science ,business.industry ,Model selection ,Machine learning ,computer.software_genre ,Biochemistry ,Multi-objective optimization ,03 medical and health sciences ,Range (mathematics) ,030104 developmental biology ,Software ,Development (topology) ,Structural Biology ,Data mining ,Artificial intelligence ,business ,Molecular Biology ,computer ,Native structure - Abstract
We report new Rosetta-based approaches to tackling the major issues that confound protein structure refinement, and the testing of these approaches in the CASP11 experiment. Automated refinement protocols were developed that integrate a range of sampling methods using parallel computation and multiobjective optimization. In CASP11, we used a more aggressive large-scale structure rebuilding approach for poor starting models, and a less aggressive local rebuilding plus core refinement approach for starting models likely to be closer to the native structure. The more incorrectly modeled a structure was predicted to be, the more it was allowed to vary during refinement. The CASP11 experiment revealed strengths and weaknesses of the approaches: the high-resolution strategy incorporating local rebuilding with core refinement consistently improved starting structures, while the low-resolution strategy incorporating the reconstruction of large parts of the structures improved starting models in some cases but often considerably worsened them, largely because of model selection issues. Overall, the results suggest the high-resolution refinement protocol is a promising method orthogonal to other approaches, while the low-resolution refinement method clearly requires further development. Proteins 2016; 84(Suppl 1):314-322. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
32. Protein structure prediction using Rosetta in CASP12
- Author
-
David E. Kim, Frank DiMaio, David Baker, Sergey Ovchinnikov, and Hahnbeom Park
- Subjects
0301 basic medicine ,Models, Molecular ,Protein Folding ,Computer science ,Protein Conformation ,Ab initio prediction ,Crystallography, X-Ray ,01 natural sciences ,Biochemistry ,Article ,03 medical and health sciences ,Protein structure ,Structural Biology ,Sequence Analysis, Protein ,0103 physical sciences ,Humans ,Molecular Biology ,Native structure ,Simulation ,010304 chemical physics ,Computational Biology ,Proteins ,Protein structure prediction ,030104 developmental biology ,Algorithm ,Algorithms ,Protein Structure Initiative - Abstract
We describe several notable aspects of our structure predictions using Rosetta in CASP12 in the free modeling (FM) and refinement (TR) categories. First, we had previously generated (and published) models for most large protein families lacking experimentally determined structures using Rosetta guided by co-evolution-based contact predictions, and for several targets these models proved better starting points for comparative modeling than any known crystal structure; our model database thus starts to fulfill one of the goals of the original protein structure initiative. Second, while our "human" group simply submitted ROBETTA models for most targets, for six targets expert intervention improved predictions considerably; the largest improvement was for T0886 where we correctly parsed two discontinuous domains guided by predicted contact maps to accurately identify a structural homolog of the same fold. Third, Rosetta all-atom refinement followed by MD simulations led to consistent but small improvements when starting models were close to the native structure, and larger but less consistent improvements when starting models were further away.
- Published
- 2017
33. Blind prediction of interfacial water positions in CAPRI
- Author
-
Pravin Muthu, Joy Sarmiento, John Wieting, Thom Vreven, Hasup Lee, Dima Kozakov, Haruki Nakamura, Julie C. Mitchell, Juan Fernández-Recio, Haim J. Wolfson, Sergei Grudinin, Yuko Tsuchiya, Iain H. Moal, Efrat Farkash, Chiara Pallara, Petras J. Kundrotas, Howook Hwang, Chaok Seok, Panagiotis L. Kastritis, Hahnbeom Park, Xiaoqin Zou, Junsu Ko, Justyna Aleksandra Wojdyla, Brian G. Pierce, Christophe Schmitz, Colin Kleanthous, Sanbo Qin, Shoshana J. Wodak, Paul A. Bates, Matsuyuki Shirota, Solène Grosdidier, Idit Buch, Ilya A. Vakser, Krishna Praneeth Kilambi, Jianqing Xu, Matthieu Chavent, Sandor Vajda, Adrien S. J. Melquiond, Marc F. Lensink, Shen You Huang, Martin Zacharias, David W. Ritchie, Brian Jiménez-García, Marc van Dijk, Ezgi Karaca, Yoichi Murakami, Daron M. Standley, Albert Solernou, Laura Pérez-Cano, Yang Shen, Miriam Eisenstein, Jeffrey J. Gray, Alexandre M. J. J. Bonvin, Zhiping Weng, Georgy Derevyanko, Kengo Kinoshita, Huan-Xiang Zhou, and Eiji Kanamori
- Subjects
0303 health sciences ,010304 chemical physics ,Chemistry ,01 natural sciences ,Biochemistry ,Molecular Docking Simulation ,Force field (chemistry) ,Protein–protein interaction ,03 medical and health sciences ,Crystallography ,Molecular recognition ,Protein structure ,Structural Biology ,Docking (molecular) ,0103 physical sciences ,Critical assessment ,Macromolecular docking ,Biological system ,Molecular Biology ,030304 developmental biology - Abstract
We report the first assessment of blind predictions of water positions at protein-protein interfaces, performed as part of the critical assessment of predicted interactions (CAPRI) community-wide experiment. Groups submitting docking predictions for the complex of the DNase domain of colicin E2 and Im2 immunity protein (CAPRI Target 47) were invited to predict the positions of interfacial water molecules using the method of their choice. The predictions (20 groups submitted a total of 195 models) were assessed by measuring the recall fraction of water-mediated protein contacts. Of the 176 high- or medium-quality docking models (a very good docking performance per se), only 44% had a recall fraction above 0.3, and a mere 6% above 0.5. The actual water positions were in general predicted to an accuracy level no better than 1.5 Å, and even in good models about half of the contacts represented false positives. This notwithstanding, three hotspot interface water positions were quite well predicted, and so was one of the water positions that is believed to stabilize the loop that confers specificity in these complexes. Overall the best interface water predictions were achieved by groups that also produced high-quality docking models, indicating that accurate modelling of the protein portion is a determinant factor. The use of established molecular mechanics force fields, coupled to sampling and optimization procedures, also seemed to confer an advantage. Insights gained from this analysis should help improve the prediction of protein-water interactions and their role in stabilizing protein complexes.
- Published
- 2013
34. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions
- Author
-
Howook Hwang, Shiyong Liu, Xiaoqin Zou, Huan-Xiang Zhou, Hideaki Umeyama, Paul A. Bates, Hahnbeom Park, Yangyu Huang, Xiaolei Zhu, Marianne Rooman, Rudi Agius, David Baker, Sarel J. Fleishman, Dimitri Gillis, Eiji Kanamori, Yuko Tsuchiya, Sandor Vajda, Panagiotis L. Kastritis, Brian Jimenez, Thom Vreven, Xiufeng Yang, Hiromitsu Shimoyama, Nan Zhao, Zhiping Weng, Sheng-You Huang, Mikael Trellet, Chaok Seok, Samuel C. Flores, Miguel Romero-Durana, Sanbo Qin, Michael S. Pacella, Julie C. Mitchell, Mayuko Takeda-Shitaka, Dmitri Beglov, Jeffrey J. Gray, Shoshana J. Wodak, Rocco Moretti, Martin Zacharias, Dmitry Korkin, Dima Kozakov, João P. G. L. M. Rodrigues, Haruki Nakamura, Juan Esquivel-Rodríguez, Mieczyslaw Torchala, Yves Dehouck, Alexandre M. J. J. Bonvin, David R. Hall, Mitsuo Iwadate, Krishna Praneeth Kilambi, Jamica Sarmiento, Daron M. Standley, Joël Janin, Omar N. A. Demerdash, Brian G. Pierce, Chiara Pallara, Meng Cui, Shusuke Teraguchi, Petr Popov, Hasup Lee, Haotian Li, Juan Fernández-Recio, Laura Pérez-Cano, Sergei Grudinin, Sameer Velankar, Daisuke Kihara, Xiaofeng Ji, Genki Terashi, Yi Xiao, Shide Liang, and Iain H. Moal
- Subjects
Genetics ,0303 health sciences ,Mutation ,010304 chemical physics ,Fitness landscape ,Stability (learning theory) ,Computational biology ,Yeast display ,Biology ,medicine.disease_cause ,01 natural sciences ,Biochemistry ,Deep sequencing ,Protein–protein interaction ,03 medical and health sciences ,Structural Biology ,0103 physical sciences ,medicine ,CASP ,Saturated mutagenesis ,Molecular Biology ,030304 developmental biology - Abstract
Community-wide blind prediction experiments such as CAPRI and CASP provide an objective measure of the current state of predictive methodology. Here we describe a community-wide assessment of methods to predict the effects of mutations on protein-protein interactions. Twenty-two groups predicted the effects of comprehensive saturation mutagenesis for two designed influenza hemagglutinin binders and the results were compared with experimental yeast display enrichment data obtained using deep sequencing. The most successful methods explicitly considered the effects of mutation on monomer stability in addition to binding affinity, carried out explicit side-chain sampling and backbone relaxation, evaluated packing, electrostatic, and solvation effects, and correctly identified around a third of the beneficial mutations. Much room for improvement remains for even the best techniques, and large-scale fitness landscapes should continue to provide an excellent test bed for continued evaluation of both existing and new prediction methodologies.
- Published
- 2013
35. The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design
- Author
-
Maxim V. Shapovalov, Matthew J. O’Meara, Vikram Khipple Mulligan, Frank DiMaio, Hahnbeom Park, Jeffrey J. Gray, Andrew Leaver-Fay, Richard Bonneau, Michael S. Pacella, David Baker, Rhiju Das, Kalli Kappel, Jason W. Labonte, Tanja Kortemme, Rebecca F. Alford, Brian Kuhlman, Roland L. Dunbrack, Philip Bradley, Jeliazko R. Jeliazkov, and P. Douglas Renfrew
- Subjects
0301 basic medicine ,Molecular model ,Computer science ,Macromolecular Substances ,Protein Conformation ,media_common.quotation_subject ,Static Electricity ,Nanotechnology ,Crystal structure ,Computational biology ,Molecular Dynamics Simulation ,010402 general chemistry ,01 natural sciences ,Force field (chemistry) ,Article ,03 medical and health sciences ,Molecular dynamics ,Protein structure ,HIV Protease ,Physical and Theoretical Chemistry ,Function (engineering) ,media_common ,chemistry.chemical_classification ,Physics ,Protein therapeutics ,Biomolecule ,computer.file_format ,Small molecule ,Amino acid ,0104 chemical sciences ,Computer Science Applications ,030104 developmental biology ,Membrane protein ,chemistry ,Atom (standard) ,Mutation ,Nucleic acid ,Thermodynamics ,Modeling and design ,computer ,Energy (signal processing) ,Macromolecule - Abstract
Over the past decade, the Rosetta biomolecular modeling suite has informed diverse biological questions and engineering challenges ranging from interpretation of low-resolution structural data to design of nanomaterials, protein therapeutics, and vaccines. Central to Rosetta’s success is the energy function: a model parameterized from small molecule and X-ray crystal structure data used to approximate the energy associated with each biomolecule conformation. This paper describes the mathematical models and physical concepts that underlie the latest Rosetta energy function, beta_nov15. Applying these concepts, we explain how to use Rosetta energies to identify and analyze the features of biomolecular models. Finally, we discuss the latest advances in the energy function that extend capabilities from soluble proteins to also include membrane proteins, peptides containing non-canonical amino acids, carbohydrates, nucleic acids, and other macromolecules.
- Published
- 2017
- Full Text
- View/download PDF
36. Simultaneous Optimization of Biomolecular Energy Functions on Features from Small Molecules and Macromolecules
- Author
-
David Baker, Yuan Liu, Frank DiMaio, Hahnbeom Park, Vikram Khipple Mulligan, Philip Bradley, David E. Kim, and Per Jr Greisen
- Subjects
0301 basic medicine ,Implicit solvation ,Static Electricity ,Ligands ,01 natural sciences ,Molecular Docking Simulation ,Article ,Quantitative Biology::Subcellular Processes ,03 medical and health sciences ,Protein structure ,Computational chemistry ,0103 physical sciences ,Physical and Theoretical Chemistry ,Range (particle radiation) ,Quantitative Biology::Biomolecules ,010304 chemical physics ,Chemistry ,Quantitative Biology::Molecular Networks ,Proteins ,Hydrogen Bonding ,Function (mathematics) ,Protein structure prediction ,Electrostatics ,Computer Science Applications ,Protein Structure, Tertiary ,030104 developmental biology ,Thermodynamics ,Biological system ,Energy (signal processing) ,Protein Binding - Abstract
Most biomolecular modeling energy functions for structure prediction, sequence design, and molecular docking, have been parameterized using existing macromolecular structural data; this contrasts molecular mechanics force fields which are largely optimized using small-molecule data. In this study, we describe an integrated method that enables optimization of a biomolecular modeling energy function simultaneously against small-molecule thermodynamic data and high-resolution macromolecular structural data. We use this approach to develop a next-generation Rosetta energy function that utilizes a new anisotropic implicit solvation model, and an improved electrostatics and Lennard-Jones model, illustrating how energy functions can be considerably improved in their ability to describe large-scale energy landscapes by incorporating both small-molecule and macromolecule data. The energy function improves performance in a wide range of protein structure prediction challenges, including monomeric structure prediction, protein-protein and protein-ligand docking, protein sequence design, and prediction of the free energy changes by mutation, while reasonably recapitulating small-molecule thermodynamic properties.
- Published
- 2016
37. GalaxyWEB server for protein structure prediction and refinement
- Author
-
Chaok Seok, Lim Heo, Hahnbeom Park, and Junsu Ko
- Subjects
Protein structure database ,Web server ,Loop (graph theory) ,Internet ,business.industry ,Protein Conformation ,Articles ,Protein structure prediction ,Biology ,Bioinformatics ,computer.software_genre ,User-Computer Interface ,Template ,Software ,Protein structure ,Sequence Analysis, Protein ,Server ,Genetics ,business ,Algorithm ,computer - Abstract
Three-dimensional protein structures provide invaluable information for understanding and regulating biological functions of proteins. The GalaxyWEB server predicts protein structure from sequence by template-based modeling and refines loop or terminus regions by ab initio modeling. This web server is based on the method tested in CASP9 (9th Critical Assessment of techniques for protein Structure Prediction) as 'Seok-server', which was assessed to be among top performing template-based modeling servers. The method generates reliable core structures from multiple templates and re-builds unreliable loops or termini by using an optimization-based refinement method. In addition to structure prediction, a user can also submit a refinement only job by providing a starting model structure and locations of loops or termini to refine. The web server can be freely accessed at http://galaxy.seoklab.org/.
- Published
- 2012
38. The FALC-Loop web server for protein loop modeling
- Author
-
Junsu Ko, Dongseon Lee, Hahnbeom Park, Chaok Seok, Julian Lee, and Evangelos A. Coutsias
- Subjects
Models, Molecular ,Web server ,Protein Conformation ,Interface (Java) ,Biology ,Bioinformatics ,computer.software_genre ,01 natural sciences ,03 medical and health sciences ,Software ,0103 physical sciences ,Genetics ,Loop modeling ,Homology modeling ,030304 developmental biology ,For loop ,Internet ,0303 health sciences ,010304 chemical physics ,business.industry ,Articles ,Construct (python library) ,Loop (topology) ,business ,computer ,Algorithm - Abstract
The FALC-Loop web server provides an online interface for protein loop modeling by employing an ab initio loop modeling method called FALC (fragment assembly and analytical loop closure). The server may be used to construct loop regions in homology modeling, to refine unreliable loop regions in experimental structures or to model segments of designed sequences. The FALC method is computationally less expensive than typical ab initio methods because the conformational search space is effectively reduced by the use of fragments derived from a structure database. The analytical loop closure algorithm allows efficient search for loop conformations that fit into the protein framework starting from the fragment-assembled structures. The FALC method shows prediction accuracy comparable to other state-of-the-art loop modeling methods. Top-ranked model structures can be visualized on the web server, and an ensemble of loop structures can be downloaded for further analysis. The web server can be freely accessed at http://falc-loop.seoklab.org/.
- Published
- 2011
- Full Text
- View/download PDF
39. Strength of Cα−H···OC Hydrogen Bonds in Transmembrane Proteins
- Author
-
Chaok Seok, Hahnbeom Park, and Jungki Yoon
- Subjects
Transmembrane domain ,Crystallography ,Hydrogen bond ,Chemistry ,Helix ,Materials Chemistry ,Ab initio ,Side chain ,Molecule ,Crystal structure ,Physical and Theoretical Chemistry ,Transmembrane protein ,Surfaces, Coatings and Films - Abstract
A large number of Cα−H···O contacts are present in transmembrane protein structures, but the contribution of such interactions to protein stability is still not well understood. According to previous ab initio quantum calculations, the stabilization energy of a Cα−H···O contact is about 2-3 kcal/mol. However, experimental studies on two different Cα−H···O hydrogen bonds present in transmembrane proteins lead to conclusions that one contact is only weakly stabilizing and the other is not even stabilizing. We note that most previous computational studies were on optimized geometries of isolated molecules, but the experimental measurements were on those in the structural context of transmembrane proteins. In the present study, 263 Cα−H···O=C contacts in α-helical transmembrane proteins were extracted from X-ray crystal structures, and interaction energies were calculated with quantum mechanical methods. The average stabilization energy of a Cα−H···O=C interaction was computed to be 1.4 kcal/mol. About 13% of contacts were stabilizing by more than 3 kcal/mol, and about 11% were destabilizing. Analysis of the relationships between energy and structure revealed four interaction patterns: three types of attractive cases in which an additional Cα−H···O or N−H···O contact is present and a type of repulsive case in which repulsion between two carbonyl oxygen atoms occurs. Contribution of Cα−H···O=C contacts to protein stability is roughly estimated to be greater than 5 kcal/mol per helix pair for about 16% of transmembrane helices but for only 3% of soluble protein helices. The contribution would be larger if Cα−H···O contacts involving side chain oxygen were also considered.
- Published
- 2007
- Full Text
- View/download PDF
40. Large-scale determination of previously unsolved protein structures using evolutionary information
- Author
-
Sergey Ovchinnikov, Hetunandan Kamisetty, Yuxing Liao, David Baker, David E. Kim, Hahnbeom Park, Lisa N. Kinch, Nick V. Grishin, and Jimin Pei
- Subjects
Models, Molecular ,Protein family ,QH301-705.5 ,Protein Conformation ,Science ,Protein design ,Computational biology ,co-evolution ,Biology ,General Biochemistry, Genetics and Molecular Biology ,Structural genomics ,Evolution, Molecular ,Protein structure ,Bacterial Proteins ,B. subtilis ,Protein function prediction ,Biology (General) ,Genetics ,General Immunology and Microbiology ,General Neuroscience ,S. solfataricus ,E. coli ,Computational Biology ,General Medicine ,Protein structure prediction ,Biophysics and Structural Biology ,protein family ,structure prediction ,protein fold ,Genomics and Evolutionary Biology ,H. salinarum ,Structural biology ,Medicine ,Threading (protein sequence) ,Research Article - Abstract
The prediction of the structures of proteins without detectable sequence similarity to any protein of known structure remains an outstanding scientific challenge. Here we report significant progress in this area. We first describe de novo blind structure predictions of unprecedented accuracy we made for two proteins in large families in the recent CASP11 blind test of protein structure prediction methods by incorporating residue–residue co-evolution information in the Rosetta structure prediction program. We then describe the use of this method to generate structure models for 58 of the 121 large protein families in prokaryotes for which three-dimensional structures are not available. These models, which are posted online for public access, provide structural information for the over 400,000 proteins belonging to the 58 families and suggest hypotheses about mechanism for the subset for which the function is known, and hypotheses about function for the remainder. DOI: http://dx.doi.org/10.7554/eLife.09248.001, eLife digest Proteins are long chains made up of small building blocks called amino acids. These chains fold up in various ways to form the three-dimensional structures that proteins need to be able to work properly. Therefore, to understand how a protein works it is important to determine its structure, but this is very challenging. It is possible to predict the structure of a protein with high accuracy if previous experiments have revealed the structure of a similar protein. However, for almost half of all known families of proteins, there are currently no members whose structures have been solved. The three-dimensional shape of a protein is determined by interactions between various amino acids. During evolution, the structure and activity of proteins often remain the same across species, even if the amino acid sequences have changed.
This is because pairs of amino acids that interact with each other tend to ‘co-evolve’; that is, if one amino acid changes, then the second amino acid also changes in order to accommodate it. By identifying these pairs of co-evolving amino acids, it is possible to work out which amino acids are close to each other in the three-dimensional structure of the protein. This information can be used to generate a structural model of a protein using computational methods. Now, Ovchinnikov et al. developed a new method to predict the structures of proteins that combines data on the co-evolution of amino acids, as identified by GREMLIN with the structural prediction software called Rosetta. A community-wide experiment called CASP—which tests different methods of protein prediction—showed that, in two cases, this new method works much better than anything previously used to predict the structures of proteins. Ovchinnikov et al. then used this method to make models for proteins belonging to 58 different protein families with currently unknown structures. These predictions were found to be highly accurate and the protein families each have thousands of members, so Ovchinnikov et al.'s findings are expected to be useful to researchers in a wide variety of research areas. A future challenge is to extend these methods to the many protein families that have hundreds rather than thousands of members. DOI: http://dx.doi.org/10.7554/eLife.09248.002
- Published
- 2015
- Full Text
- View/download PDF
41. Author response: Large-scale determination of previously unsolved protein structures using evolutionary information
- Author
-
Jimin Pei, Nick V. Grishin, Hetunandan Kamisetty, Lisa N. Kinch, David Baker, David E. Kim, Hahnbeom Park, Sergey Ovchinnikov, and Yuxing Liao
- Subjects
Scale (ratio) ,Computer science ,Evolutionary information ,Data mining ,computer.software_genre ,computer - Published
- 2015
- Full Text
- View/download PDF
42. Structure prediction using sparse simulated NOE restraints with Rosetta in CASP11
- Author
-
Sergey Ovchinnikov, Hahnbeom Park, David E. Kim, Yuan Liu, Ray Yu-Ruei Wang, and David Baker
- Subjects
inorganic chemicals ,Models, Molecular ,Protein Conformation, alpha-Helical ,Internet ,Protein Folding ,Models, Statistical ,International Cooperation ,Amino Acid Motifs ,Computational Biology ,Proteins ,Article ,Computer Simulation ,Protein Conformation, beta-Strand ,Protein Interaction Domains and Motifs ,Databases, Protein ,Nuclear Magnetic Resonance, Biomolecular ,Algorithms ,Software - Abstract
In CASP11 we generated protein structure models using simulated ambiguous and unambiguous nuclear Overhauser effect (NOE) restraints with a two stage protocol. Low resolution models were generated guided by the unambiguous restraints using continuous chain folding for alpha and alpha-beta proteins, and iterative annealing for all beta proteins to take advantage of the strand pairing information implicit in the restraints. The Rosetta fragment/model hybridization protocol was then used to recombine and regularize these models, and refine them in the Rosetta full atom energy function guided by both the unambiguous and the ambiguous restraints. Fifteen out of 19 targets were modeled with GDT-TS quality scores greater than 60 for Model 1, significantly improving upon the non-assisted predictions. Our results suggest that atomic level accuracy is achievable using sparse NOE data when there is at least one correctly assigned NOE for every residue. Proteins 2016; 84(Suppl 1):181-188. © 2016 Wiley Periodicals, Inc.
- Published
- 2015
43. CASP11 refinement experiments with ROSETTA
- Author
-
Hahnbeom Park, Frank DiMaio, and David Baker
- Subjects
Protein Conformation, alpha-Helical ,Internet ,Protein Folding ,Models, Statistical ,Sequence Homology, Amino Acid ,Amino Acid Motifs ,Computational Biology ,Proteins ,Molecular Dynamics Simulation ,Article ,Protein Structure, Tertiary ,Benchmarking ,Humans ,Thermodynamics ,Protein Conformation, beta-Strand ,Protein Interaction Domains and Motifs ,Algorithms ,Software - Abstract
We report new Rosetta-based approaches to tackling the major issues that confound protein structure refinement, and the testing of these approaches in the CASP11 experiment. Automated refinement protocols were developed that integrate a range of sampling methods using parallel computation and multiobjective optimization. In CASP11, we used a more aggressive large-scale structure rebuilding approach for poor starting models, and a less aggressive local rebuilding plus core refinement approach for starting models likely to be closer to the native structure. The more incorrectly modeled a structure was predicted to be, the more it was allowed to vary during refinement. The CASP11 experiment revealed strengths and weaknesses of the approaches: the high-resolution strategy incorporating local rebuilding with core refinement consistently improved starting structures, while the low-resolution strategy incorporating the reconstruction of large parts of the structures improved starting models in some cases but often considerably worsened them, largely because of model selection issues. Overall, the results suggest the high-resolution refinement protocol is a promising method orthogonal to other approaches, while the low-resolution refinement method clearly requires further development. Proteins 2016; 84(Suppl 1):314-322. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
44. Advances in GPCR Modeling Evaluated by the GPCR Dock 2013 Assessment: Meeting New Challenges
- Author
-
Camillo Rosano, Jie Xia, David Rodriguez, Manuel Pastor, Stefano Moro, Ajit Jadhav, Balaji Selvam, Sebastien Fiorucci, Ingebrigt Sylte, Serge Antonczak, Vsevolod Katritch, Serdar Durdagi, Ki Chul Park, Gerard Van Westen, Henri Xhaard, Raymond Stevens, Jiye Shi, Philip Biggin, Davide Sabbadin, SHUGUANG YUAN, Slawomir Filipek, David Gloriam, Antonella Ciancetta, Marek Bajda, Supriyo Bhattacharya, Hahnbeom Park, Jens Carlsson, Aleksandrs Gutcaits, Bartosz Trzaskowski, Jianyi Yang, William Pitt, Dorota Latek, Liliana Halip, Elizabeth Nguyen, Irina Kufareva, Noel Southall, Joaquin Ambia, Woong-Hee Shin, Jana Selent, Uddhavesh Sonavane, Marco Ponassi, Julien Diharce, Jose Manuel Perez-Aguilar, Nicos Petasis, Sergei Grudinin, Skaggs School of Pharmacy and Pharmaceutical Sciences [San Diego], University of California [San Diego] (UC San Diego), University of California-University of California, Department of Molecular Biology [San Diego], The Scripps Research Institute [La Jolla], University of California-University of California-University of California [San Diego] (UC San Diego), Algorithms for Modeling and Simulation of Nanosystems (NANO-D), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK), Centre National de la Recherche Scientifique (CNRS)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Université Joseph Fourier - Grenoble 1 (UJF)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Centre National de la Recherche Scientifique (CNRS)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Université Joseph Fourier - Grenoble 1 (UJF)-Université Pierre Mendès France - Grenoble 2 (UPMF), Institut de Chimie de Nice (ICN), Université Nice Sophia Antipolis (... 
- 2019) (UNS), COMUE Université Côte d'Azur (2015 - 2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015 - 2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA), University of California (UC)-University of California (UC), The Scripps Research Institute [La Jolla, San Diego], Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Laboratoire Jean Kuntzmann (LJK ), and Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
- Subjects
Models, Molecular ,Protein Conformation ,Bioinformatics ,01 natural sciences ,Molecular Docking Simulation ,Receptors, G-Protein-Coupled ,Protein structure ,Ligand docking ,Models ,Structural Biology ,Receptors ,0303 health sciences ,Crystallography ,[SDV.BBM.BS]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Structural Biology [q-bio.BM] ,Homology modeling ,Biological Sciences ,[SDV.BBM.BS]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Biomolecules [q-bio.BM] ,Smoothened Receptor ,Structure-based drug discovery ,Protein Binding ,Serotonin ,1.1 Normal biological development and functioning ,Allosteric regulation ,Biophysics ,Computational biology ,Biology ,010402 general chemistry ,Article ,G-Protein-Coupled ,03 medical and health sciences ,Underpinning research ,Information and Computing Sciences ,DOCK ,Structure prediction ,Humans ,Molecular Biology ,030304 developmental biology ,G protein-coupled receptor ,Conformational sampling ,Pharmacology ,Participants of GPCR Dock 2013 ,Binding Sites ,Molecular ,[INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation ,0104 chemical sciences ,Docking (molecular) ,Receptors, Serotonin ,Chemical Sciences ,G-protein coupled receptor ,[SDV.SP.PHARMA]Life Sciences [q-bio]/Pharmaceutical sciences/Pharmacology ,Generic health relevance ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] - Abstract
Despite tremendous successes of GPCR crystallography, the receptors with available structures represent only a small fraction of human GPCRs. An important role of the modeling community is to maximize structural insights for the remaining receptors and complexes. The community-wide GPCR Dock assessment was established to stimulate and monitor the progress in molecular modeling and ligand docking for GPCRs. The four targets in the present third assessment round presented new and diverse challenges for modelers, including prediction of allosteric ligand interaction and activation states in 5-hydroxytryptamine receptors 1B and 2B, and modeling by extremely distant homology for smoothened receptor. Forty-four modeling groups participated in the assessment. State-of-the-art modeling approaches achieved close-to-experimental accuracy for small rigid orthosteric ligands and models built by close homology, and they correctly predicted protein fold for distant homology targets. Predictions of long loops and GPCR activation states remain unsolved problems.
- Published
- 2014
- Full Text
- View/download PDF
45. Protein loop modeling using a new hybrid energy function and its application to modeling in inaccurate structural environments
- Author
-
Chaok Seok, Lim Heo, Gyu Rie Lee, and Hahnbeom Park
- Subjects
Models, Molecular ,Protein Structure ,Protein Conformation ,Protein design ,New energy ,lcsh:Medicine ,Protein Structure Prediction ,Crystallography, X-Ray ,Bioinformatics ,Biochemistry ,Force field (chemistry) ,Computational Chemistry ,Macromolecular Structure Analysis ,Loop modeling ,lcsh:Science ,Molecular Biology ,Global optimization ,Physics ,Multidisciplinary ,lcsh:R ,Biology and Life Sciences ,Proteins ,Computational Biology ,Reproducibility of Results ,Hybrid energy ,Ranging ,Protein structure prediction ,Chemistry ,Kinetics ,Physical Sciences ,Thermodynamics ,lcsh:Q ,Algorithm ,Algorithms ,Research Article - Abstract
Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.
- Published
- 2014
46. GalaxyRefine: Protein structure refinement driven by side-chain repacking
- Author
-
Chaok Seok, Hahnbeom Park, and Lim Heo
- Subjects
Web server ,Internet ,business.industry ,Protein Conformation ,media_common.quotation_subject ,Structure (category theory) ,Relaxation (iterative method) ,Articles ,Protein structure prediction ,Biology ,Molecular Dynamics Simulation ,Bioinformatics ,computer.software_genre ,Software ,Protein structure ,Server ,Genetics ,Quality (business) ,business ,computer ,Algorithm ,media_common - Abstract
The quality of model structures generated by contemporary protein structure prediction methods strongly depends on the degree of similarity between the target and available template structures. Therefore, the importance of improving template-based model structures beyond the accuracy available from template information has been emphasized in the structure prediction community. The GalaxyRefine web server, freely available at http://galaxy.seoklab.org/refine, is based on a refinement method that has been successfully tested in CASP10. The method first rebuilds side chains and performs side-chain repacking and subsequent overall structure relaxation by molecular dynamics simulation. According to the CASP10 assessment, this method showed the best performance in improving the local structure quality. The method can improve both global and local structure quality on average, when used for refining the models generated by state-of-the-art protein structure prediction servers.
- Published
- 2013
47. GalaxyGemini: a web server for protein homo-oligomer structure prediction based on similarity
- Author
-
Hahnbeom Park, Chaok Seok, Hasup Lee, and Junsu Ko
- Subjects
Statistics and Probability ,Protein structure database ,Web server ,Sequence analysis ,Sequence (biology) ,Computational biology ,Biology ,computer.software_genre ,Biochemistry ,Oligomer ,chemistry.chemical_compound ,Protein structure ,Sequence Analysis, Protein ,Molecular Biology ,Internet ,Sequence Homology, Amino Acid ,Proteins ,Protein structure prediction ,Computer Science Applications ,Computational Mathematics ,Computational Theory and Mathematics ,chemistry ,Structural Homology, Protein ,Protein quaternary structure ,Data mining ,Protein Multimerization ,computer ,Software - Abstract
Summary: A large number of proteins function as homo-oligomers; therefore, predicting homo-oligomeric structure of proteins is of primary importance for understanding protein function at the molecular level. Here, we introduce a web server for prediction of protein homo-oligomer structure. The server takes a protein monomer structure as input and predicts its homo-oligomer structure from oligomer templates selected based on sequence and tertiary/quaternary structure similarity. Using protein model structures as input, the server shows clear improvement over the best methods of CASP9 in predicting oligomeric structures from amino acid sequences. Availability: http://galaxy.seoklab.org/gemini. Contact: chaok@snu.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2013
48. Sampling of GPCR Second Extracellular Loops using Geometric Constraints
- Author
-
Hahnbeom Park and Chaok Seok
- Subjects
Transmembrane domain ,Loop closure ,Docking (molecular) ,Structural similarity ,Computer science ,Disulfide bond ,Biophysics ,Bioinformatics ,Global optimization ,Algorithm ,Large size ,G protein-coupled receptor - Abstract
Second extracellular loops (ECL2) of G protein-coupled receptors (GPCR) are known to play important roles by accommodating various GPCR ligands and providing ligand specificity. Despite the structural similarity among GPCR proteins, ECL2 structure is particularly hard to predict because of the relatively large size and ill-conserved sequence. In this study, we developed an efficient sampling algorithm for GPCR ECL2 that utilizes geometric constraints specific for GPCR. Two applications of the triaxial loop closure algorithm were employed to sample geometrically plausible ECL2 conformations that form a well-conserved disulfide bond with a particular transmembrane helix. Scores based on geometric constraints that effectively describe ECL2 environment were introduced to facilitate filtering of implausible ECL2 structures. All of these components are purely geometric, hence sampling and filtering can be performed with extremely low computational cost. A benchmark test was performed on seven unique GPCRs for which all-atom structures have been revealed. The result shows that the best model out of 50 sampled structures is of acceptable accuracy with the median loop RMSD less than 5 Å. Combined with energy-guided global optimization, further refined ECL2 structures could be obtained. New ideas introduced in this study may be useful for developing methodologies for further GPCR modeling and docking studies.
- Published
- 2012
- Full Text
- View/download PDF
49. Refinement of unreliable local regions in template-based protein models
- Author
-
Hahnbeom Park and Chaok Seok
- Subjects
Models, Molecular ,Sequence ,Model refinement ,Computer science ,Protein Conformation ,Computational Biology ,Proteins ,Protein structure prediction ,Models, Theoretical ,computer.software_genre ,Biochemistry ,Structural Biology ,Modelling methods ,Protein model ,Humans ,Critical assessment ,Loop modeling ,Data mining ,Template based ,Molecular Biology ,Algorithm ,computer ,Algorithms ,Software - Abstract
Contemporary template-based modeling techniques allow applications of modeling methods to vast biological problems. However, they tend to fail to provide accurate structures for less-conserved local regions in sequence even when the overall structure can be modeled reliably. We call these regions unreliable local regions (ULRs). Accurate modeling of ULRs is of enormous value because they are frequently involved in functional specificity. In this article, we introduce a new method for modeling ULRs in template-based models by employing a sophisticated loop modeling technique. Combined with our previous study on protein termini, the method is applicable to refinement of both loop and terminus ULRs. A large-scale test carried out in a blind fashion in CASP9 (the 9th Critical Assessment of techniques for protein structure prediction) shows that ULR structures are improved over initial template-based models by refinement in more than 70% of the successfully detected ULRs. It is also notable that successful modeling of several long ULRs over 12 residues is achieved. Overall, the current results show that a careful application of loop and terminus modeling can be a promising tool for model refinement in template-based modeling.
- Published
- 2011
50. Community-Wide Assessment of Protein-Interface Modeling Suggests Improvements to Design Methodology
- Author
-
Libin Cao, Anne Poupon, Brian G. Pierce, Howook Hwang, Ying Chen, Victor L. Hsu, Hasup Lee, Yangyu Huang, Daisuke Kihara, Juan Fernández-Recio, Vladimir Potapov, Aroop Sircar, Chaok Seok, Timothy A. Whitehead, Jérôme Azé, Nir Ben Tal, Seren Soner, Brian Kuhlman, P. Benjamin Stranges, Nobuyuki Uchikoga, Sanbo Qin, Xinqi Gong, Yi Xiao, Carlos J. Camacho, Yaoqi Zhou, Gideon Schreiber, Ora Schueler-Furman, Paul A. Bates, Krishna Praneeth Kilambi, Joël Janin, Mati Cohen, Julie C. Mitchell, Panwen Wang, Cunxin Wang, Raed Khashan, Mayuko Takeda-Shitaka, Lin Li, Martin Zacharias, Alexander Tropsha, Genki Terashi, Xiaofan Li, David Baker, Jian Zhan, Julie Bernauer, Zohar Itzhaki, Mainak Guharoy, Eva-Maria Strauch, Xiaoqin Zou, Thom Vreven, Hahnbeom Park, Sheng-You Huang, Stephen Bush, Daron M. Standley, Feng Yang, Yuko Tsuchiya, Fan Jiang, Jacob E. Corn, Takashi Ishida, Chunhua Li, Junsu Ko, Robert G. Hall, Thomas Bourquard, Iain H. Moal, Weiyi Zhang, C.M. Driggers, Nir London, Jessica L. Morgan, Ron Jacak, Haruki Nakamura, Laura Pérez-Cano, Denis Fouches, Bin Liu, Yutaka Akiyama, Omar N. A. Demerdash, Yuval Inbar, Xianjin Xu, Yuedong Yang, Dachuan Guo, Masahito Ohue, Turkan Haliloglu, Jeffrey J. Gray, Juan Esquivel-Rodríguez, Alexandre M. J. J. Bonvin, Pemra Ozbek, Sarel J. Fleishman, Şefik Kerem Ovali, Charles H. Robert, Huan-Xiang Zhou, Eiji Kanamori, Yuri Matsuzaki, Carles Pons, Zhiping Weng, Kengo Kinoshita, Shoshana J. Wodak, Shiyong Liu, Panagiotis L. 
Kastritis, University of Washington [Seattle], Institute of Molecular Biophysics [Tallahassee], Florida State University [Tallahassee] (FSU), University of Wisconsin Whitewater, Kitasato University, Biomolecular Modelling laboratory [London], Cancer Research UK London Research Institute, Technische Universität Munchen - Université Technique de Munich [Munich, Allemagne] (TUM), Seoul National University [Seoul] (SNU), Knowledge representation, reasonning (ORPAILLEUR), INRIA Lorraine, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS), Algorithms and Models for Integrative Biology (AMIB ), Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX), École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Physiologie de la reproduction et des comportements [Nouzilly] (PRC), Institut National de la Recherche Agronomique (INRA)-Institut Français du Cheval et de 
l'Equitation [Saumur]-Université de Tours (UT)-Centre National de la Recherche Scientifique (CNRS), Department of Chemical Engineering [Bogazici] (ChE), Boǧaziçi üniversitesi = Boğaziçi University [Istanbul], Tel Aviv University [Tel Aviv], University of Massachusetts Medical School [Worcester] (UMASS), University of Massachusetts System (UMASS), Barcelona Supercomputing Center - Centro Nacional de Supercomputacion (BSC - CNS), Chinese Academy of Sciences [Beijing] (CAS), Beijing University of Technology, Laboratoire de biochimie théorique [Paris] (LBT (UPR_9080)), Université Paris Diderot - Paris 7 (UPD7)-Centre National de la Recherche Scientifique (CNRS)-Institut de biologie physico-chimique (IBPC (FR_550)), Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Institut de Chimie du CNRS (INC), Huazhong University of Science and Technology [Wuhan] (HUST), Hadassah Hebrew University Medical Center [Jerusalem], Weizmann Institute of Science [Rehovot, Israël], Institute for Protein Research [Osaka], Osaka University [Osaka], Japan Biological Informatics Consortium [Tokyo], WPI Immunology Frontier Research Center (IFREC), Graduate School of Information Sciences [Sendai], Tohoku University [Sendai], Oregon State University (OSU), Indiana University - Purdue University Indianapolis (IUPUI), Indiana University System, Bijvoet Center for Biomolecular Research [Utrecht], Utrecht University [Utrecht], University of Pittsburgh (PITT), Pennsylvania Commonwealth System of Higher Education (PCSHE), Johns Hopkins University (JHU), Tokyo Institute of Technology [Tokyo] (TITECH), University of North Carolina [Chapel Hill] (UNC), University of North Carolina System (UNC), Purdue University [West Lafayette], Dalton Cardiovascular Research Center [Columbia], University of Missouri [Columbia] (Mizzou), University of Missouri System-University of Missouri System, The Hospital for sick children [Toronto] (SickKids), Institut de 
biochimie et biophysique moléculaire et cellulaire (IBBMC), Université Paris-Sud - Paris 11 (UP11)-Centre National de la Recherche Scientifique (CNRS), The authors thank Sameer Velankar and Marc Lensink for their help in coordinating this experiment and Raik Grunberg for many helpful suggestions on a draft. S.J.F. was supported by a long-term fellowship from the Human Frontier Science Program. S.J.W. is Canada Research Chair Tier 1, funded by the Canadian Institutes of Health Research. Research in the Baker laboratory was supported by the Howard Hughes Medical Institute, the Defense Advanced Research Projects Agency, the National Institutes of Health Yeast Resource Center, and the Defense Threat Reduction Agency., Technical University of Munich (TUM), Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Laboratoire de Recherche en Informatique (LRI), Centre National de la Recherche Scientifique (CNRS)-Université de Tours-Institut Français du Cheval et de l'Equitation [Saumur]-Institut National de la Recherche Agronomique (INRA), Boğaziçi University [Istanbul], Centre National de la Recherche Scientifique (CNRS)-Institut de biologie physico-chimique (IBPC), Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Diderot - Paris 7 (UPD7), Weizmann Institute of Science, University of Missouri [Columbia], Institut National de la Recherche Agronomique (INRA)-Institut Français du Cheval et de l'Equitation [Saumur] (IFCE)-Université de Tours (UT)-Centre National de la Recherche Scientifique (CNRS), Tel Aviv University (TAU), Institut de biologie physico-chimique (IBPC (FR_550)), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Diderot - Paris 7 (UPD7)-Institut de Chimie du CNRS (INC)-Centre 
National de la Recherche Scientifique (CNRS), Institut National de la Recherche Agronomique (INRA)-Institut Français du Cheval et de l'Equitation [Saumur]-Université de Tours-Centre National de la Recherche Scientifique (CNRS), Université Paris Diderot - Paris 7 (UPD7)-Institut de biologie physico-chimique (IBPC), and Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Models, Molecular ,biochemistry and molecular biology ,Computer science ,Protein design ,Nanotechnology ,Machine learning ,computer.software_genre ,Article ,03 medical and health sciences ,Structural Biology ,protein protein interactions ,Taverne ,conformational plasticity ,Computational design ,Macromolecular docking ,CASP ,Design methods ,Molecular Biology ,030304 developmental biology ,Protein interface ,0303 health sciences ,Binding Sites ,[SDV.BBM.BS]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Structural Biology [q-bio.BM] ,business.industry ,030302 biochemistry & molecular biology ,Proteins ,[SDV.BBM.BS]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Biomolecules [q-bio.BM] ,negative design ,Docking (molecular) ,computational protein design ,Critical assessment ,Artificial intelligence ,business ,computer ,Protein Binding
- Abstract
The CAPRI (Critical Assessment of Predicted Interactions) and CASP (Critical Assessment of protein Structure Prediction) experiments have demonstrated the power of community-wide tests of methodology in assessing the current state of the art and spurring progress in the very challenging areas of protein docking and structure prediction. We sought to bring the power of community-wide experiments to bear on a very challenging protein design problem that provides a complementary but equally fundamental test of current understanding of protein-binding thermodynamics. We have generated a number of designed protein-protein interfaces with very favorable computed binding energies but which do not appear to be formed in experiments, suggesting that there may be important physical chemistry missing in the energy calculations. A total of 28 research groups took up the challenge of determining what is missing: we provided structures of 87 designed complexes and 120 naturally occurring complexes and asked participants to identify energetic contributions and/or structural features that distinguish between the two sets. The community found that electrostatics and solvation terms partially distinguish the designs from the natural complexes, largely due to the nonpolar character of the designed interactions. Beyond this polarity difference, the community found that the designed binding surfaces were, on average, structurally less embedded in the designed monomers, suggesting that backbone conformational rigidity at the designed surface is important for realization of the designed function. These results can be used to improve computational design strategies, but there is still much to be learned; for example, one designed complex, which does form in experiments, was classified by all metrics as a nonbinder.
- Published
- 2011