21 results on '"Hahnbeom Park"'
Search Results
2. Accurate prediction of protein structures and interactions using a three-track neural network
- Author
-
Jose Henrique Pereira, Ana C. Ebrecht, Lisa N. Kinch, R. Dustin Schaeffer, Ivan Anishchenko, Justas Dauparas, Udit Dalwadi, Gyu Rie Lee, Christoph Buhlheller, Diederik J. Opperman, David Baker, Tea Pavkov-Keller, Qian Cong, Caleb R. Glassman, Alberdina A. van Dijk, Jue Wang, Andria V. Rodrigues, Theo Sagmeister, Randy J. Read, Andy DeGiovanni, Hahnbeom Park, Paul D. Adams, Calvin K. Yip, Frank DiMaio, John E. Burke, Claudia Millán, K. Christopher Garcia, Carson Adams, Minkyung Baek, Nick V. Grishin, Sergey Ovchinnikov, and Manoj K. Rathinaswamy
- Subjects
Structure (mathematical logic) ,0303 health sciences ,Sequence ,Network architecture ,Multidisciplinary ,Artificial neural network ,business.industry ,Computer science ,Deep learning ,computer.software_genre ,Modeling and simulation ,03 medical and health sciences ,Structural bioinformatics ,0302 clinical medicine ,Data mining ,Artificial intelligence ,business ,Distance transform ,computer ,030217 neurology & neurosurgery ,030304 developmental biology
- Abstract
DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging x-ray crystallography and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.
- Published
- 2021
3. Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14
- Author
-
Hahnbeom Park, David Baker, David E. Kim, Naozumi Hiranuma, Ivan Anishchenko, Ian R. Humphreys, Justas Dauparas, Minkyung Baek, and Sanaa Mansoor
- Subjects
Similarity (geometry) ,Computer science ,Orientation (computer vision) ,business.industry ,Deep learning ,Pipeline (computing) ,Computational Biology ,Proteins ,Protein structure prediction ,Biochemistry ,Protein Structure, Tertiary ,Deep Learning ,Sequence Analysis, Protein ,Structural Biology ,Benchmark (computing) ,Humans ,Metagenome ,Artificial intelligence ,Language model ,business ,CASP ,Molecular Biology ,Algorithm ,Software
- Abstract
The trRosetta structure prediction method employs deep learning to generate predicted residue-residue distance and orientation distributions from which 3D models are built. We sought to improve the method by incorporating as inputs (in addition to sequence information) both language model embeddings and template information weighted by sequence similarity to the target. We also developed a refinement pipeline that recombines models generated by template-free and template-utilizing versions of trRosetta guided by the DeepAccNet accuracy predictor. Both benchmark tests and CASP results show that the new pipeline is a considerable improvement over the original trRosetta, and it is faster and requires fewer computing resources, completing the entire modeling process in a median < 3 h in CASP14. Our human group improved results with this pipeline primarily by identifying additional homologous sequences for input into the network. We also used the DeepAccNet accuracy predictor to guide Rosetta high-resolution refinement for submissions in the regular and refinement categories; although performance was quite good on a CASP relative scale, the overall improvements were rather modest, in part due to missing inter-domain or inter-chain contacts.
- Published
- 2021
4. High‐accuracy refinement using Rosetta in CASP13
- Author
-
Gyu Rie Lee, Qian Cong, Hahnbeom Park, Ivan Anishchenko, David E. Kim, and David Baker
- Subjects
Models, Molecular ,Protein Folding ,Fold (higher-order function) ,Protein Conformation ,Computer science ,media_common.quotation_subject ,Biochemistry ,Article ,03 medical and health sciences ,Structural Biology ,Energy level ,Search problem ,Quality (business) ,Molecular Biology ,030304 developmental biology ,media_common ,Structure (mathematical logic) ,0303 health sciences ,030302 biochemistry & molecular biology ,Computational Biology ,Proteins ,Reproducibility of Results ,Function (mathematics) ,Protein structure prediction ,Thermodynamics ,Algorithm ,Algorithms ,Energy (signal processing)
- Abstract
Because proteins generally fold to their lowest free energy states, energy-guided refinement in principle should be able to systematically improve the quality of protein structure models generated using homologous structure or co-evolution derived information. However, because of the high dimensionality of the search space, there are far more ways to degrade the quality of a near-native model than to improve it, and hence, refinement methods are very sensitive to energy function errors. In the 13th Critical Assessment of techniques for protein Structure Prediction (CASP13), we sought to carry out a thorough search for low energy states in the neighborhood of a starting model using restraints to avoid straying too far. The approach was reasonably successful in improving both regions largely incorrect in the starting models and core regions that started out closer to the correct structure. Models with GDT-HA over 70 were obtained for five targets and for one of those, an accuracy of 0.5 Å backbone root-mean-square deviation (RMSD) was achieved. An important current challenge is to improve performance in refining oligomers and larger proteins, for which the search problem remains extremely difficult.
- Published
- 2019
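Note on result 4: the abstract reports accuracy as backbone root-mean-square deviation (RMSD) in Å. As a rough illustration of what that number measures, the following is a minimal sketch, not the authors' evaluation code. It assumes the model and reference coordinates are already superimposed and listed atom by atom in the same order; the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points, in the input units.

    Assumption: the two structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between each matched atom pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three atoms, each displaced by 0.5 along y (hypothetical data).
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
native = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]
value = rmsd(model, native)
```

In practice an optimal superposition is usually applied before the deviation is computed; that step is deliberately left out of this sketch.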
5. Author response for 'Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14'
- Author
-
Justas Dauparas, David Baker, David E. Kim, Minkyung Baek, Ian R. Humphreys, Naozumi Hiranuma, Ivan Anishchenko, Hahnbeom Park, and Sanaa Mansoor
- Subjects
business.industry ,Computer science ,Deep learning ,Artificial intelligence ,business ,computer.software_genre ,computer ,Protein tertiary structure ,Natural language processing
- Published
- 2021
6. Accurate prediction of protein structures and interactions using a 3-track network
- Author
-
Nick V. Grishin, Minkyung Baek, Udit Dalwadi, Gyu Rie Lee, Hahnbeom Park, Carson Adams, Alberdina A. van Dijk, Manoj K. Rathinaswamy, Theo Sagmeister, Qian Cong, Frank DiMaio, Randy J. Read, David Baker, Paul D. Adams, Sergey Ovchinnikov, Christoph Buhlheller, Calvin K. Yip, Caleb R. Glassman, Ivan Anishchenko, R. Dustin Schaeffer, Claudia Millán, Diederik J. Opperman, Tea Pavkov-Keller, Jose Henrique Pereira, Ana C. Ebrecht, Lisa N. Kinch, Jue Wang, John E. Burke, Kenan Christopher Garcia, Andria V. Rodrigues, Justas Dauparas, and Andy DeGiovanni
- Subjects
Structure (mathematical logic) ,Network architecture ,Sequence ,Protein structure ,Computer science ,Data mining ,Track (rail transport) ,Protein structure modeling ,computer.software_genre ,computer ,Distance transform
- Abstract
DeepMind presented remarkably accurate protein structure predictions at the CASP14 conference. We explored network architectures incorporating related ideas and obtained the best performance with a 3-track network in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The 3-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables rapid solution of challenging X-ray crystallography and cryo-EM structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate models of protein-protein complexes from sequence information alone, short-circuiting traditional approaches which require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.
One-Sentence Summary: Accurate protein structure modeling enables rapid solution of structure determination problems and provides insights into biological function.
- Published
- 2021
7. Prediction of Protein Mutational Free Energy: Benchmark and Sampling Improvements Increase Classification Accuracy
- Author
-
Frank DiMaio, Steven M. Lewis, Yifan Song, Hahnbeom Park, Brandon Frenz, and Indigo Chris King
- Subjects
0301 basic medicine ,Histology ,protein design and engineering ,Computer science ,lcsh:Biotechnology ,Biomedical Engineering ,Bioengineering ,02 engineering and technology ,computer.software_genre ,03 medical and health sciences ,thermodynamics ,Software ,Protein stability ,lcsh:TP248.13-248.65 ,Methods ,Alanine ,Software suite ,business.industry ,Point mutation ,Experimental data ,Bioengineering and Biotechnology ,mutation free energy ,021001 nanoscience & nanotechnology ,030104 developmental biology ,Data mining ,mutation ,0210 nano-technology ,business ,protein ,computer ,Biotechnology
- Abstract
Software to predict the change in protein stability upon point mutation is a valuable tool for a number of biotechnological and scientific problems. To facilitate the development of such software and provide easy access to the available experimental data, the ProTherm database was created. Biases in the methods and types of information collected have led to disparity in the types of mutations for which experimental data are available. For example, mutations to alanine are hugely overrepresented whereas those involving charged residues, especially from one charged residue to another, are underrepresented. ProTherm subsets created as benchmark sets that do not account for this often underrepresent certain mutational types. This issue introduces systematic biases into previously published protocols’ ability to accurately predict the change in folding energy on these classes of mutations. To resolve this issue, we have generated a new benchmark set with these problems corrected. We have then used the benchmark set to test a number of improvements to the point mutation energetics tools in the Rosetta software suite.
- Published
- 2020
8. Automatic structure prediction of oligomeric assemblies using Robetta in CASP12
- Author
-
Frank DiMaio, David E. Kim, David Baker, Sergey Ovchinnikov, and Hahnbeom Park
- Subjects
Models, Molecular ,0301 basic medicine ,Structure (mathematical logic) ,Protein Conformation ,Computer science ,Pipeline (computing) ,Computational Biology ,Proteins ,computer.software_genre ,Biochemistry ,Article ,Set (abstract data type) ,03 medical and health sciences ,Crystallography ,030104 developmental biology ,Biological Problem ,Sequence Analysis, Protein ,Structural Biology ,Humans ,Data mining ,Protein Multimerization ,Databases, Protein ,Molecular Biology ,computer ,Software
- Abstract
Many naturally occurring protein systems function primarily as symmetric assemblies. Prediction of the quaternary structure of these assemblies is an important biological problem. This manuscript describes automated tools we have developed for predicting the structure of symmetric protein assemblies in the Robetta structure prediction server. We assess the performance of this pipeline on a set of targets from the recent CASP12/CAPRI blind quaternary structure prediction experiment. Our approach successfully predicted five of seven symmetric assemblies in this challenge, and was assessed as the best participating server group, and one of only two groups (human or server) with two predictions judged as high quality by the assessors. We also assess the method on a broader set of 22 natively symmetric CASP12 targets, where we show that oligomeric modeling can improve the accuracy of monomeric structure determination, particularly in highly intertwined oligomers.
- Published
- 2017
9. Improved protein structure prediction using predicted interresidue orientations
- Author
-
Zhenling Peng, Ivan Anishchenko, David Baker, Hahnbeom Park, Jianyi Yang, and Sergey Ovchinnikov
- Subjects
0301 basic medicine ,Computer science ,Protein Conformation ,Residual ,Modeling and simulation ,03 medical and health sciences ,Structural bioinformatics ,0302 clinical medicine ,Deep Learning ,Sequence Analysis, Protein ,Range (statistics) ,Animals ,Humans ,Multidisciplinary ,business.industry ,Deep learning ,Protein structure prediction ,Biological Sciences ,030104 developmental biology ,Benchmark (computing) ,Critical assessment ,Artificial intelligence ,business ,Algorithm ,030217 neurology & neurosurgery ,Software
- Abstract
The prediction of interresidue contacts and distances from coevolutionary data using deep learning has considerably advanced protein structure prediction. Here, we build on these advances by developing a deep residual network for predicting interresidue orientations, in addition to distances, and a Rosetta-constrained energy-minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on 13th Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13)- and Continuous Automated Model Evaluation (CAMEO)-derived sets, the method outperforms all previously described structure-prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo-designed proteins, identifying the key fold-determining residues and providing an independent quantitative measure of the “ideality” of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.
- Published
- 2020
10. Improved protein structure prediction using predicted inter-residue orientations
- Author
-
Ivan Anishchenko, Hahnbeom Park, Jianyi Yang, David Baker, Zhenling Peng, and Sergey Ovchinnikov
- Subjects
Quantitative measure ,0303 health sciences ,03 medical and health sciences ,Computer science ,030302 biochemistry & molecular biology ,A protein ,Protein structure prediction ,Energy minimization ,Algorithm ,030304 developmental biology
- Abstract
The prediction of inter-residue contacts and distances from co-evolutionary data using deep learning has considerably advanced protein structure prediction. Here we build on these advances by developing a deep residual network for predicting inter-residue orientations in addition to distances, and a Rosetta constrained energy minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on CASP13 and CAMEO derived sets, the method outperforms all previously described structure prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo designed proteins, identifying the key fold determining residues and providing an independent quantitative measure of the “ideality” of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.
- Published
- 2019
11. Macromolecular modeling and design in Rosetta: recent methods and frameworks
- Author
-
Jack Maguire, Ragul Gowthaman, Marion F. Sauer, Georg Kuenze, Tanja Kortemme, Benjamin Basanta, Indigo Chris King, Jens Meiler, Rhiju Das, Ora Schueler-Furman, Nicholas A. Marze, Brandon Frenz, Christoffer Norn, Julia Koehler Leman, Jason W. Labonte, Kala Bharath Pilla, Lei Shi, Sergey Lyskov, Brian D. Weitzner, Nir London, Karen R. Khar, Jaume Bonet, Nawsad Alam, Andreas Scheck, Alexander M. Sevy, Lars Malmström, Thomas Huber, Christopher Bystroff, Lior Zimmerman, Lorna Dsilva, Bruno E. Correia, Roland L. Dunbrack, Sergey Ovchinnikov, Rocco Moretti, Scott Horowitz, Phil Bradley, Frank DiMaio, Noah Ollikainen, Brian Kuhlman, Jeffrey J. Gray, Melanie L. Aprahamian, Andrew Leaver-Fay, Santrupti Nerli, Brian Koepnick, Xingjie Pan, Manasi A. Pethe, Andrew M. Watkins, Summer B. Thyme, Enrique Marcos, Vikram Khipple Mulligan, Hahnbeom Park, Po-Ssu Huang, David K. Johnson, Daniel-Adriano Silva, Patrick Barth, Shannon Smith, Caleb Geniesse, Jason K. Lai, Patrick Conway, Amelie Stein, Jeliazko R. Jeliazkov, David Baker, Dominik Gront, Kalli Kappel, Firas Khatib, Robert Kleffner, Brian J. Bender, Richard Bonneau, Kyle A. Barlow, Joseph H. Lubin, Shourya S. Roy Burman, Nikolaos G. Sgourakis, Yuval Sedan, Ryan E. Pavlovicz, Kristin Blacklock, Seth Cooper, Barak Raveh, Alisa Khramushin, John Karanicolas, Justin B. Siegel, Sharon L. Guffy, Brian G. Pierce, Alex Ford, Darwin Y. Fu, Orly Marcu, Gideon Lapidoth, Brian Coventry, René M. de Jong, Shane O’Conchúir, Thomas W. Linsky, William R. Schief, Rebecca F. Alford, Scott E. Boyken, Sagar D. Khare, Maria Szegedy, Ray Yu-Ruei Wang, Steven M. Lewis, Hamed Khakzad, Timothy M. Jacobs, Frank D. Teets, Lukasz Goldschmidt, Daisuke Kuroda, Steffen Lindert, P. Douglas Renfrew, Yifan Song, Jared Adolf-Bryfogle, Michael S. Pacella, and Aliza B. Rubenstein
- Subjects
atomic-accuracy ,Models, Molecular ,Computer science ,Macromolecular Substances ,Protein Conformation ,Interoperability ,computational design ,Score ,antibody structures ,Biochemistry ,Article ,homing endonuclease specificity ,03 medical and health sciences ,Software ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,business.industry ,Proteins ,Usability ,fold determination ,Cell Biology ,Molecular Docking Simulation ,variable region ,Docking (molecular) ,protein-structure prediction ,small-molecule docking ,Modeling and design ,Peptidomimetics ,User interface ,Software engineering ,business ,de-novo design ,sparse nmr data ,Biotechnology
- Abstract
The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at ., This Perspective reviews tools developed over the past five years in the macromolecular modeling, docking and design software Rosetta.
- Published
- 2019
12. High-resolution protein–protein docking by global optimization: recent advances and future challenges
- Author
-
Hahnbeom Park, Chaok Seok, and Hasup Lee
- Subjects
Lead Finder ,Computer science ,Protein protein ,Proteins ,Water ,Nanotechnology ,Computational biology ,Molecular Docking Simulation ,Protein–protein interaction ,Protein–ligand docking ,Structural Biology ,Docking (molecular) ,Humans ,Protein–protein interaction prediction ,Molecular Biology ,Global optimization
- Abstract
A computational protein-protein docking method that predicts atomic details of protein-protein interactions from protein monomer structures is an invaluable tool for understanding the molecular mechanisms of protein interactions and for designing molecules that control such interactions. Compared to low-resolution docking, high-resolution docking explores the conformational space in atomic resolution to provide predictions with atomic details. This allows for applications to more challenging docking problems that involve conformational changes induced by binding. Recently, high-resolution methods have become more promising as additional information such as global shapes or residue contacts are now available from experiments or sequence/structure data. In this review article, we highlight developments in high-resolution docking made during the last decade, specifically regarding global optimization methods employed by the docking methods. We also discuss two major challenges in high-resolution docking: prediction of backbone flexibility and water-mediated interactions.
- Published
- 2015
13. CASP11 refinement experiments with ROSETTA
- Author
-
Frank DiMaio, David Baker, and Hahnbeom Park
- Subjects
0301 basic medicine ,Protocol (science) ,Structure (mathematical logic) ,Computer science ,business.industry ,Model selection ,Machine learning ,computer.software_genre ,Biochemistry ,Multi-objective optimization ,03 medical and health sciences ,Range (mathematics) ,030104 developmental biology ,Software ,Development (topology) ,Structural Biology ,Data mining ,Artificial intelligence ,business ,Molecular Biology ,computer ,Native structure
- Abstract
We report new Rosetta-based approaches to tackling the major issues that confound protein structure refinement, and the testing of these approaches in the CASP11 experiment. Automated refinement protocols were developed that integrate a range of sampling methods using parallel computation and multiobjective optimization. In CASP11, we used a more aggressive large-scale structure rebuilding approach for poor starting models, and a less aggressive local rebuilding plus core refinement approach for starting models likely to be closer to the native structure. The more incorrectly modeled a structure was predicted to be, the more it was allowed to vary during refinement. The CASP11 experiment revealed strengths and weaknesses of the approaches: the high-resolution strategy incorporating local rebuilding with core refinement consistently improved starting structures, while the low-resolution strategy incorporating the reconstruction of large parts of the structures improved starting models in some cases but often considerably worsened them, largely because of model selection issues. Overall, the results suggest the high-resolution refinement protocol is a promising method orthogonal to other approaches, while the low-resolution refinement method clearly requires further development. Proteins 2016; 84(Suppl 1):314-322. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
14. Protein structure prediction using Rosetta in CASP12
- Author
-
David E. Kim, Frank DiMaio, David Baker, Sergey Ovchinnikov, and Hahnbeom Park
- Subjects
0301 basic medicine ,Models, Molecular ,Protein Folding ,Computer science ,Protein Conformation ,Ab initio prediction ,Crystallography, X-Ray ,01 natural sciences ,Biochemistry ,Article ,03 medical and health sciences ,Protein structure ,Structural Biology ,Sequence Analysis, Protein ,0103 physical sciences ,Humans ,Molecular Biology ,Native structure ,Simulation ,010304 chemical physics ,Computational Biology ,Proteins ,Protein structure prediction ,030104 developmental biology ,Algorithm ,Algorithms ,Protein Structure Initiative
- Abstract
We describe several notable aspects of our structure predictions using Rosetta in CASP12 in the free modeling (FM) and refinement (TR) categories. First, we had previously generated (and published) models for most large protein families lacking experimentally determined structures using Rosetta guided by co-evolution based contact predictions, and for several targets these models proved better starting points for comparative modeling than any known crystal structure; our model database thus starts to fulfill one of the goals of the original protein structure initiative. Second, while our "human" group simply submitted ROBETTA models for most targets, for six targets expert intervention improved predictions considerably; the largest improvement was for T0886, where we correctly parsed two discontinuous domains guided by predicted contact maps to accurately identify a structural homolog of the same fold. Third, Rosetta all-atom refinement followed by MD simulations led to consistent but small improvements when starting models were close to the native structure, and larger but less consistent improvements when starting models were further away.
- Published
- 2017
15. The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design
- Author
-
Maxim V. Shapovalov, Matthew J. O’Meara, Vikram Khipple Mulligan, Frank DiMaio, Hahnbeom Park, Jeffrey J. Gray, Andrew Leaver-Fay, Richard Bonneau, Michael S. Pacella, David Baker, Rhiju Das, Kalli Kappel, Jason W. Labonte, Tanja Kortemme, Rebecca F. Alford, Brian Kuhlman, Roland L. Dunbrack, Philip Bradley, Jeliazko R. Jeliazkov, and P. Douglas Renfrew
- Subjects
0301 basic medicine ,Molecular model ,Computer science ,Macromolecular Substances ,Protein Conformation ,media_common.quotation_subject ,Static Electricity ,Nanotechnology ,Crystal structure ,Computational biology ,Molecular Dynamics Simulation ,010402 general chemistry ,01 natural sciences ,Force field (chemistry) ,Article ,03 medical and health sciences ,Molecular dynamics ,Protein structure ,HIV Protease ,Physical and Theoretical Chemistry ,Function (engineering) ,media_common ,chemistry.chemical_classification ,Physics ,Protein therapeutics ,Biomolecule ,computer.file_format ,Small molecule ,Amino acid ,0104 chemical sciences ,Computer Science Applications ,030104 developmental biology ,Membrane protein ,chemistry ,Atom (standard) ,Mutation ,Nucleic acid ,Thermodynamics ,Modeling and design ,computer ,Energy (signal processing) ,Macromolecule
- Abstract
Over the past decade, the Rosetta biomolecular modeling suite has informed diverse biological questions and engineering challenges ranging from interpretation of low-resolution structural data to design of nanomaterials, protein therapeutics, and vaccines. Central to Rosetta’s success is the energy function: a model parameterized from small molecule and X-ray crystal structure data used to approximate the energy associated with each biomolecule conformation. This paper describes the mathematical models and physical concepts that underlie the latest Rosetta energy function, beta_nov15. Applying these concepts, we explain how to use Rosetta energies to identify and analyze the features of biomolecular models. Finally, we discuss the latest advances in the energy function that extend capabilities from soluble proteins to also include membrane proteins, peptides containing non-canonical amino acids, carbohydrates, nucleic acids, and other macromolecules.
- Published
- 2017
16. Author response: Large-scale determination of previously unsolved protein structures using evolutionary information
- Author
-
Jimin Pei, Nick V. Grishin, Hetunandan Kamisetty, Lisa N. Kinch, David Baker, David E. Kim, Hahnbeom Park, Sergey Ovchinnikov, and Yuxing Liao
- Subjects
Scale (ratio) ,Computer science ,Evolutionary information ,Data mining ,computer.software_genre ,computer
- Published
- 2015
17. Sampling of GPCR Second Extracellular Loops using Geometric Constraints
- Author
-
Hahnbeom Park and Chaok Seok
- Subjects
Transmembrane domain ,Loop closure ,Docking (molecular) ,Structural similarity ,Computer science ,Disulfide bond ,Biophysics ,Bioinformatics ,Global optimization ,Algorithm ,Large size ,G protein-coupled receptor
- Abstract
Second extracellular loops (ECL2) of G protein-coupled receptors (GPCRs) are known to play important roles by accommodating various GPCR ligands and providing ligand specificity. Despite the structural similarity among GPCR proteins, ECL2 structure is particularly hard to predict because of its relatively large size and poorly conserved sequence. In this study, we developed an efficient sampling algorithm for GPCR ECL2 that utilizes geometric constraints specific to GPCRs. Two applications of the triaxial loop closure algorithm were employed to sample geometrically plausible ECL2 conformations that form a well-conserved disulfide bond with a particular transmembrane helix. Scores based on geometric constraints that effectively describe the ECL2 environment were introduced to facilitate filtering of implausible ECL2 structures. All of these components are purely geometric, hence sampling and filtering can be performed with extremely low computational cost. A benchmark test was performed on seven unique GPCRs for which all-atom structures have been revealed. The result shows that the best model out of 50 sampled structures is of acceptable accuracy, with the median loop RMSD less than 5 Å. Combined with energy-guided global optimization, further refined ECL2 structures could be obtained. New ideas introduced in this study may be useful for developing methodologies for further GPCR modeling and docking studies.
- Published
- 2012
18. Refinement of unreliable local regions in template-based protein models
- Author
-
Hahnbeom Park and Chaok Seok
- Subjects
Models, Molecular ,Sequence ,Model refinement ,Computer science ,Protein Conformation ,Computational Biology ,Proteins ,Protein structure prediction ,Models, Theoretical ,computer.software_genre ,Biochemistry ,Structural Biology ,Modelling methods ,Protein model ,Humans ,Critical assessment ,Loop modeling ,Data mining ,Template based ,Molecular Biology ,Algorithm ,computer ,Algorithms ,Software
- Abstract
Contemporary template-based modeling techniques allow applications of modeling methods to vast biological problems. However, they tend to fail to provide accurate structures for less-conserved local regions in sequence even when the overall structure can be modeled reliably. We call these regions unreliable local regions (ULRs). Accurate modeling of ULRs is of enormous value because they are frequently involved in functional specificity. In this article, we introduce a new method for modeling ULRs in template-based models by employing a sophisticated loop modeling technique. Combined with our previous study on protein termini, the method is applicable to refinement of both loop and terminus ULRs. A large-scale test carried out in a blind fashion in CASP9 (the 9th Critical Assessment of techniques for protein structure prediction) shows that ULR structures are improved over initial template-based models by refinement in more than 70% of the successfully detected ULRs. It is also notable that successful modeling of several long ULRs over 12 residues is achieved. Overall, the current results show that a careful application of loop and terminus modeling can be a promising tool for model refinement in template-based modeling.
- Published
- 2011
19. Community-Wide Assessment of Protein-Interface Modeling Suggests Improvements to Design Methodology
- Author
-
Libin Cao, Anne Poupon, Brian G. Pierce, Howook Hwang, Ying Chen, Victor L. Hsu, Hasup Lee, Yangyu Huang, Daisuke Kihara, Juan Fernández-Recio, Vladimir Potapov, Aroop Sircar, Chaok Seok, Timothy A. Whitehead, Jérôme Azé, Nir Ben Tal, Seren Soner, Brian Kuhlman, P. Benjamin Stranges, Nobuyuki Uchikoga, Sanbo Qin, Xinqi Gong, Yi Xiao, Carlos J. Camacho, Yaoqi Zhou, Gideon Schreiber, Ora Schueler-Furman, Paul A. Bates, Krishna Praneeth Kilambi, Joël Janin, Mati Cohen, Julie C. Mitchell, Panwen Wang, Cunxin Wang, Raed Khashan, Mayuko Takeda-Shitaka, Lin Li, Martin Zacharias, Alexander Tropsha, Genki Terashi, Xiaofan Li, David Baker, Jian Zhan, Julie Bernauer, Zohar Itzhaki, Mainak Guharoy, Eva-Maria Strauch, Xiaoqin Zou, Thom Vreven, Hahnbeom Park, Sheng-You Huang, Stephen Bush, Daron M. Standley, Feng Yang, Yuko Tsuchiya, Fan Jiang, Jacob E. Corn, Takashi Ishida, Chunhua Li, Junsu Ko, Robert G. Hall, Thomas Bourquard, Iain H. Moal, Weiyi Zhang, C.M. Driggers, Nir London, Jessica L. Morgan, Ron Jacak, Haruki Nakamura, Laura Pérez-Cano, Denis Fouches, Bin Liu, Yutaka Akiyama, Omar N. A. Demerdash, Yuval Inbar, Xianjin Xu, Yuedong Yang, Dachuan Guo, Masahito Ohue, Turkan Haliloglu, Jeffrey J. Gray, Juan Esquivel-Rodríguez, Alexandre M. J. J. Bonvin, Pemra Ozbek, Sarel J. Fleishman, Şefik Kerem Ovali, Charles H. Robert, Huan-Xiang Zhou, Eiji Kanamori, Yuri Matsuzaki, Carles Pons, Zhiping Weng, Kengo Kinoshita, Shoshana J. Wodak, Shiyong Liu, and Panagiotis L. Kastritis
- Affiliations
University of Washington [Seattle], Institute of Molecular Biophysics [Tallahassee], Florida State University [Tallahassee] (FSU), University of Wisconsin Whitewater, Kitasato University, Biomolecular Modelling laboratory [London], Cancer Research UK London Research Institute, Technische Universität Munchen - Université Technique de Munich [Munich, Allemagne] (TUM), Seoul National University [Seoul] (SNU), Knowledge representation, reasonning (ORPAILLEUR), INRIA Lorraine, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS), Algorithms and Models for Integrative Biology (AMIB), Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX), École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Physiologie de la reproduction et des comportements [Nouzilly] (PRC), Institut National de la Recherche Agronomique (INRA)-Institut Français du Cheval et de l'Equitation [Saumur]-Université de Tours (UT)-Centre National de la Recherche Scientifique (CNRS), Department of Chemical Engineering [Bogazici] (ChE), Boǧaziçi üniversitesi = Boğaziçi University [Istanbul], Tel Aviv University [Tel Aviv], University of Massachusetts Medical School [Worcester] (UMASS), University of Massachusetts System (UMASS), Barcelona Supercomputing Center - Centro Nacional de Supercomputacion (BSC - CNS), Chinese Academy of Sciences [Beijing] (CAS), Beijing University of Technology, Laboratoire de biochimie théorique [Paris] (LBT (UPR_9080)), Université Paris Diderot - Paris 7 (UPD7)-Centre National de la Recherche Scientifique (CNRS)-Institut de biologie physico-chimique (IBPC (FR_550)), Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Institut de Chimie du CNRS (INC), Huazhong University of Science and Technology [Wuhan] (HUST), Hadassah Hebrew University Medical Center [Jerusalem], Weizmann Institute of Science [Rehovot, Israël], Institute for Protein Research [Osaka], Osaka University [Osaka], Japan Biological Informatics Consortium [Tokyo], WPI Immunology Frontier Research Center (IFREC), Graduate School of Information Sciences [Sendai], Tohoku University [Sendai], Oregon State University (OSU), Indiana University - Purdue University Indianapolis (IUPUI), Indiana University System, Bijvoet Center for Biomolecular Research [Utrecht], Utrecht University [Utrecht], University of Pittsburgh (PITT), Pennsylvania Commonwealth System of Higher Education (PCSHE), Johns Hopkins University (JHU), Tokyo Institute of Technology [Tokyo] (TITECH), University of North Carolina [Chapel Hill] (UNC), University of North Carolina System (UNC), Purdue University [West Lafayette], Dalton Cardiovascular Research Center [Columbia], University of Missouri [Columbia] (Mizzou), University of Missouri System-University of Missouri System, The Hospital for sick children [Toronto] (SickKids), Institut de biochimie et biophysique moléculaire et cellulaire (IBBMC), Université Paris-Sud - Paris 11 (UP11)-Centre National de la Recherche Scientifique (CNRS), Technical University of Munich (TUM), Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Laboratoire de Recherche en Informatique (LRI), Centre National de la Recherche Scientifique (CNRS)-Université de Tours-Institut Français du Cheval et de l'Equitation [Saumur]-Institut National de la Recherche Agronomique (INRA), Boğaziçi University [Istanbul], Centre National de la Recherche Scientifique (CNRS)-Institut de biologie physico-chimique (IBPC), Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Diderot - Paris 7 (UPD7), Weizmann Institute of Science, University of Missouri [Columbia], Institut National de la Recherche Agronomique (INRA)-Institut Français du Cheval et de l'Equitation [Saumur] (IFCE)-Université de Tours (UT)-Centre National de la Recherche Scientifique (CNRS), Tel Aviv University (TAU), Institut de biologie physico-chimique (IBPC (FR_550)), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Diderot - Paris 7 (UPD7)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS), Institut National de la Recherche Agronomique (INRA)-Institut Français du Cheval et de l'Equitation [Saumur]-Université de Tours-Centre National de la Recherche Scientifique (CNRS), Université Paris Diderot - Paris 7 (UPD7)-Institut de biologie physico-chimique (IBPC), and Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)
- Acknowledgments
The authors thank Sameer Velankar and Marc Lensink for their help in coordinating this experiment and Raik Grunberg for many helpful suggestions on a draft. S.J.F. was supported by a long-term fellowship from the Human Frontier Science Program. S.J.W. is Canada Research Chair Tier 1, funded by the Canadian Institutes of Health Research. Research in the Baker laboratory was supported by the Howard Hughes Medical Institute, the Defense Advanced Research Projects Agency, the National Institutes of Health Yeast Resource Center, and the Defense Threat Reduction Agency.
- Subjects
Models, Molecular ,biochemistry and molecular biology ,Computer science ,Protein design ,Nanotechnology ,Machine learning ,computer.software_genre ,Article ,03 medical and health sciences ,Structural Biology ,protein protein interactions ,Taverne ,conformational plasticity ,Computational design ,Macromolecular docking ,CASP ,Design methods ,Molecular Biology ,030304 developmental biology ,Protein interface ,0303 health sciences ,Binding Sites ,[SDV.BBM.BS]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Structural Biology [q-bio.BM] ,business.industry ,030302 biochemistry & molecular biology ,Proteins ,[SDV.BBM.BS]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Biomolecules [q-bio.BM] ,negative design ,Docking (molecular) ,computational protein design ,Critical assessment ,Artificial intelligence ,business ,computer ,Protein Binding - Abstract
The CAPRI (Critical Assessment of Predicted Interactions) and CASP (Critical Assessment of protein Structure Prediction) experiments have demonstrated the power of community-wide tests of methodology in assessing the current state of the art and spurring progress in the very challenging areas of protein docking and structure prediction. We sought to bring the power of community-wide experiments to bear on a very challenging protein design problem that provides a complementary but equally fundamental test of current understanding of protein-binding thermodynamics. We have generated a number of designed protein-protein interfaces with very favorable computed binding energies but which do not appear to be formed in experiments, suggesting that there may be important physical chemistry missing in the energy calculations. A total of 28 research groups took up the challenge of determining what is missing: we provided structures of 87 designed complexes and 120 naturally occurring complexes and asked participants to identify energetic contributions and/or structural features that distinguish between the two sets. The community found that electrostatics and solvation terms partially distinguish the designs from the natural complexes, largely due to the nonpolar character of the designed interactions. Beyond this polarity difference, the community found that the designed binding surfaces were, on average, structurally less embedded in the designed monomers, suggesting that backbone conformational rigidity at the designed surface is important for realization of the designed function. These results can be used to improve computational design strategies, but there is still much to be learned; for example, one designed complex, which does form in experiments, was classified by all metrics as a nonbinder.
- Published
- 2011
20. Template-Based Protein Modeling using Global and Local Templates
- Author
-
Junsu Ko, Jooyoung Lee, Chaok Seok, and Hahnbeom Park
- Subjects
Scheme (programming language) ,Multiple sequence alignment ,Computer science ,business.industry ,Biophysics ,3d model ,Nanotechnology ,Pattern recognition ,Protein structure prediction ,Sequence search ,Template ,Template based ,Artificial intelligence ,business ,computer ,computer.programming_language ,Sequence (medicine) - Abstract
For successful template-based protein modeling, it is important to identify template proteins relevant to the target sequence and then to generate a proper multiple sequence alignment (MSA) between the target and the templates. However, in many cases, the templates obtained by global sequence search do not provide relevant structural information for local regions represented by gaps in the MSA. We have developed a method to improve the modeling accuracy of such regions by detecting unreliable local regions and utilizing local templates that can provide more reliable structural information for those regions. Our approach takes the following steps. First, a new scoring scheme that utilizes a modified information score is employed to detect unreliable local regions. Second, local templates that are aligned to the local regions more reliably are identified. Finally, the local templates are combined with the global templates to produce better 3D models. With the newly obtained MSA containing global as well as local templates, protein 3D models are generated by a recently proposed model-building technique, MODELLER-CSA.
- Published
- 2010
21. GalaxyTBM: template-based modeling by building a reliable core and refining unreliable local regions
- Author
-
Hahnbeom Park, Chaok Seok, and Junsu Ko
- Subjects
Models, Molecular ,Computer science ,Ab initio ,lcsh:Computer applications to medicine. Medical informatics ,Model refinement ,Machine learning ,computer.software_genre ,Biochemistry ,Loop modeling ,Software ,Protein structure ,Structural Biology ,lcsh:QH301-705.5 ,Molecular Biology ,Multiple sequence alignment ,business.industry ,Applied Mathematics ,Proteins ,Protein structure prediction ,Protein superfamily ,Computer Science Applications ,Variable (computer science) ,lcsh:Biology (General) ,Structural Homology, Protein ,lcsh:R858-859.7 ,Terminus modeling ,Artificial intelligence ,DNA microarray ,business ,Sequence Alignment ,computer ,Algorithm ,Research Article - Abstract
Background: Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates. Results: We introduce GalaxyTBM, a new TBM method in which the more reliable core region is modeled first from multiple templates and less reliable, variable local regions, such as loops or termini, are then detected and re-modeled by an ab initio method. This TBM method is based on “Seok-server,” which was tested in CASP9 and assessed to be amongst the top TBM servers. The accuracy of the initial core modeling is enhanced by focusing on more conserved regions in the multiple-template selection and multiple sequence alignment stages. Additional improvement is achieved by ab initio modeling of up to 3 unreliable local regions in the fixed framework of the core structure. Overall, GalaxyTBM reproduced the performance of Seok-server, with GalaxyTBM and Seok-server resulting in average GDT-TS of 68.1 and 68.4, respectively, when tested on 68 single-domain CASP9 TBM targets. For application to multi-domain proteins, GalaxyTBM must be combined with domain-splitting methods. Conclusion: Application of GalaxyTBM to CASP9 targets demonstrates that accurate protein structure prediction is possible by use of a multiple-template-based approach, and ab initio modeling of variable regions can further enhance the model quality.
Discovery Service for Jio Institute Digital Library