31 results on '"InSuk Joung"'
Search Results
2. Non-sequential protein structure alignment by conformational space annealing and local refinement.
- Author
-
InSuk Joung, Jong Yun Kim, Keehyoung Joo, and Jooyoung Lee
- Subjects
Medicine, Science
- Abstract
Protein structure alignment is an important tool for studying evolutionary biology and protein modeling. Tools that intensively search for globally optimal non-sequential alignments are rare. We propose ALIGN-CSA, which shows improvement in scores such as DALI-score, SP-score, SO-score and TM-score on a benchmark set including 286 cases. We benchmarked existing popular alignment scoring functions, where the dependence on the search algorithm was effectively eliminated by using ALIGN-CSA. For the benchmarking, we set the minimum block size to 4 to prevent highly fragmented alignments, in which the biological relevance of small alignment blocks is hard to interpret. With this condition, globally optimal alignments were searched by ALIGN-CSA using the four scoring functions listed above, and TM-score was found to be the most effective in generating alignments with longer match lengths and smaller RMSD values. However, DALI-score is the most effective in generating alignments similar to the manually curated reference alignments, which implies that DALI-score is a more biologically relevant score. Because ALIGN-CSA places a high demand on computational resources, we also propose a relatively fast local refinement method, which can control the minimum block size and whether to allow reverse alignment. ALIGN-CSA can be used to obtain much improved alignments at the cost of more extensive computation. For faster alignment, we propose a refinement protocol that improves the score of a given alignment obtained by various external tools. All programs are available from http://lee.kias.re.kr.
- Published
- 2019
3. AN EXAMPLE OF TEMPLATE BASED PROTEIN STRUCTURE MODELING BY GLOBAL OPTIMIZATION
- Author
-
KEEHYOUNG JOO, INSUK JOUNG, and JOOYOUNG LEE
- Subjects
template based modeling, protein structure modeling, global optimization, casp, homology modeling, sequence alignment, Information technology, T58.5-58.64
- Abstract
CASP (Critical Assessment of protein Structure Prediction) is a community-wide experiment for protein structure prediction taking place every two years since 1994. In CASP11 held in 2014, according to the official CASP11 assessment, our method named 'nns' was ranked as the second best server method based on models ranked as first out of 81 targets. In 'nns', we applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For the fold recognition, a new alignment method called CRFalign was used. The good performance of the nns server method is attributed to the successful fold recognition carried out by combined methods including CRFalign, and the current modeling formulation incorporating accurate structural aspects collected from multiple templates. In this article, we provide a successful example of 'nns' predictions for T0776, for which all details of intermediate modeling data are provided.
- Published
- 2016
4. Structure-based protein folding type classification and folding rate prediction.
- Author
-
Balachandran Manavalan, Kunihiro Kuwajima, InSuk Joung, and Jooyoung Lee
- Published
- 2015
5. Industrializing AI/ML during the end-to-end drug discovery process
- Author
-
Jiho Yoo, Tae Yong Kim, InSuk Joung, and Sang Ok Song
- Subjects
Structural Biology, Molecular Biology
- Published
- 2023
6. Exploring the Folding Mechanism of Small Proteins GB1 and LB1
- Author
-
InSuk Joung, Qianyi Cheng, Jooyoung Lee, Kunihiro Kuwajima, and Juyong Lee
- Subjects
Protein Folding, 010304 chemical physics, biology, Protein Conformation, Chemistry, Kinetics, Proteins, Molecular Dynamics Simulation, 01 natural sciences, Transition state, Computer Science Applications, Folding (chemistry), Protein L, Molecular dynamics, 0103 physical sciences, biology.protein, Biophysics, Intermediate state, Protein folding, Protein G, Physical and Theoretical Chemistry
- Abstract
The computational atomistic description of the folding reactions of the B1 domains, GB1 and LB1, of protein G and protein L, respectively, is an important challenge in current protein folding studies. Although the two proteins have overall very similar backbone structures (β-hairpin-α-helix-β-hairpin), their apparent folding behaviors observed experimentally were remarkably different. LB1 folds in a two-state manner with the single-exponential kinetics, whereas GB1 folds in a more complex manner with an early stage intermediate that may exist on the folding pathway. Here, we used a new method of all-atom molecular dynamics simulations to investigate the folding mechanisms of GB1 and LB1. With the Lorentzian energy term derived from the native structure, we successfully observed frequent folding and unfolding events in the simulations at a high temperature (414 K for GB1 or 393 K for LB1) for both the proteins. Three and two transition-state structures were predicted for the GB1 and LB1 folding, respectively, at the high temperature. Two of the three transition-state structures of GB1 have a better formed second β-hairpin. One of the LB1 transition states has a better formed first hairpin, and the other has both hairpins equally formed. The structural features of these transition states are in good agreement with experimental transition-state analysis. At 300 K, more complex folding processes were observed in the simulations for both the proteins. Several intermediate structures were predicted for the two proteins, which led to the conclusion that both the proteins folded through similar mechanisms. However, the intermediate state accumulated in a sufficient amount only in the GB1 folding, which led to the double-exponential feature of its folding kinetics. On the other hand, the LB1 folding kinetics were well fitted by a single-exponential function. These results are fully consistent with those previously observed experimentally.
- Published
- 2019
7. CHARMM-GUI Glycan Modeler for modeling and simulation of carbohydrates and glycoconjugates
- Author
-
InSuk Joung, Nathan R. Kern, Jooyoung Lee, Hui Sun Lee, Sunhwan Jo, Sang-Jun Park, Jumin Lee, Wonpil Im, Yifei Qi, and Keehyung Joo
- Subjects
Glycan, Databases, Factual, Glycoconjugate, Computer science, In silico, Carbohydrates, Protein Data Bank (RCSB PDB), Context (language use), Molecular simulation, Computational biology, Biochemistry, Regular Manuscripts, Modeling and simulation, 03 medical and health sciences, N-linked glycosylation, Polysaccharides, Carbohydrate Conformation, 030304 developmental biology, chemistry.chemical_classification, 0303 health sciences, biology, 030302 biochemistry & molecular biology, Computational Biology, carbohydrates (lipids), chemistry, biology.protein, Glycoconjugates
- Abstract
Characterizing glycans and glycoconjugates in the context of three-dimensional structures is important in understanding their biological roles and developing efficient therapeutic agents. Computational modeling and molecular simulation have become essential tools complementary to experimental methods. Here, we present a computational tool, Glycan Modeler, for in silico N-/O-glycosylation of a target protein and generation of carbohydrate-only systems. In our previous study, we developed Glycan Reader, a web-based tool for detecting carbohydrate molecules in a PDB structure and generating the simulation system and input files. Integrated into Glycan Reader in CHARMM-GUI, Glycan Modeler (Glycan Reader & Modeler) generates the structures of glycans and glycoconjugates for given glycan sequences and glycosylation sites using PDB glycan template structures from the Glycan Fragment Database (http://glycanstructure.org/fragment-db). Our benchmark tests demonstrate the universal applicability of Glycan Reader & Modeler to various glycan sequences and target proteins. We also investigated the structural properties of modeled glycan structures by running 2-μs molecular dynamics simulations of HIV envelope protein. The simulations show that the modeled glycan structures built by Glycan Reader & Modeler have structural features similar to those solved by X-ray crystallography. We also describe representative examples of glycoconjugate modeling with video demos to illustrate the practical applications of Glycan Reader & Modeler. Glycan Reader & Modeler is freely available at http://charmm-gui.org/input/glycan.
- Published
- 2019
8. Conformational Space Annealing explained: A general optimization algorithm, with diverse applications
- Author
-
Jong Yun Kim, Steven P. Gross, Jooyoung Lee, InSuk Joung, and Keehyoung Joo
- Subjects
0301 basic medicine, Continuous optimization, Mathematical optimization, Optimization problem, General Physics and Astronomy, 010103 numerical & computational mathematics, 01 natural sciences, 03 medical and health sciences, Vector optimization, 030104 developmental biology, Hardware and Architecture, Discrete optimization, Test functions for optimization, Random optimization, Combinatorial optimization, 0101 mathematics, Metaheuristic, Algorithm, Mathematics
- Abstract
Many problems in science and engineering can be formulated as optimization problems. One way to solve these problems is to develop tailored problem-specific approaches. As such development is challenging, an alternative is to develop good generally-applicable algorithms. Such algorithms are easy to apply, typically function robustly, and reduce development time. Here we provide a description of one such algorithm, Conformational Space Annealing (CSA), along with its Python version, PyCSA. We previously applied it to many optimization problems including protein structure prediction and graph community detection. To demonstrate its utility, we have applied PyCSA to two continuous test functions, namely the Ackley and Eggholder functions. In addition, to show that PyCSA can be applied to any type of objective function, we demonstrate how it can be applied to a discrete objective function, namely a parameter optimization problem. Based on the benchmarking results of the three problems, the performance of CSA is shown to be better than or similar to that of the most popular optimization method, simulated annealing. For continuous objective functions, we found that L-BFGS-B was the best performing local optimization method, while for a discrete objective function Nelder–Mead was the best. The current version of PyCSA can be run in parallel at the coarse-grained level by calculating multiple independent local optimizations separately. The source code of PyCSA is available from http://lee.kias.re.kr.
- Published
- 2018
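The two continuous test functions named in the abstract above have standard textbook definitions, which can be sketched as follows. This is only an illustration of the benchmark objectives, using the conventional Ackley constants (20, 0.2, 2π); it is not taken from PyCSA, and the function names are chosen here for illustration.

```python
import math


def ackley(x):
    """Ackley function in n dimensions; its global minimum is 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)


def eggholder(x, y):
    """Eggholder function, usually evaluated on -512 <= x, y <= 512."""
    a = y + 47.0
    return (-a * math.sin(math.sqrt(abs(x / 2.0 + a)))
            - x * math.sin(math.sqrt(abs(x - a))))


if __name__ == "__main__":
    print(ackley([0.0, 0.0]))
    print(eggholder(512.0, 404.2319))
```

Both surfaces have many local minima, which is why they are common stress tests for global optimizers such as CSA or simulated annealing.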
9. A Simple and Efficient Protein Structure Refinement Method
- Author
-
InSuk Joung, Qianyi Cheng, and Jooyoung Lee
- Subjects
0301 basic medicine, Protocol (science), Millisecond, Protein Conformation, Computer science, Structure (category theory), Proteins, Molecular Dynamics Simulation, Protein structure prediction, Energy minimization, Computer Science Applications, 03 medical and health sciences, 030104 developmental biology, Protein structure, Simple (abstract algebra), Physical and Theoretical Chemistry, CASP, Algorithm, Simulation
- Abstract
Improving the quality of a given protein structure can serve as the ultimate solution for accurate protein structure prediction, and seeking such a method is currently a challenge in computational structural biology. In order to promote and encourage much needed such efforts, CASP (Critical Assessment of Structure Prediction) has been providing an ideal computational experimental platform, where it was reported only recently (since CASP10) that systematic protein structure refinement is possible by carrying out extensive (approximately millisecond) MD simulations with proper restraints generated from the given structure. Using an explicit solvent model and much reduced positional and distance restraints than previously exercised, we propose a refinement protocol that combines a series of short (5 ns) MD simulations with energy minimization procedures. Testing and benchmarking on 54 CASP8-10 refinement targets and 34 CASP11 refinement targets shows quite promising results. Using only a small fraction of MD simulation steps (nanosecond versus millisecond), systematic protein structure refinement was demonstrated in this work, indicating that refinement of a given model can be achieved using a few hours of desktop computing.
- Published
- 2017
10. Contact-assisted protein structure modeling by global optimization in CASP11
- Author
-
Keehyoung Joo, InSuk Joung, Sung Jong Lee, Qianyi Cheng, and Jooyoung Lee
- Subjects
0301 basic medicine, Optimization problem, 030102 biochemistry & molecular biology, business.industry, Chemistry, Cauchy distribution, Nuclear Overhauser effect, 3D modeling, Biochemistry, 03 medical and health sciences, 030104 developmental biology, Protein structure, Structural Biology, Computational chemistry, Protein folding, business, CASP, Biological system, Molecular Biology, Global optimization
- Abstract
We have applied the conformational space annealing method to the contact-assisted protein structure modeling in CASP11. For Tp targets, where predicted residue-residue contact information was provided, the contact energy term in the form of the Lorentzian function was implemented together with the physical energy terms used in our template-free modeling of proteins. Although we observed some structural improvement of Tp models over the models predicted without the Tp information, the improvement was not substantial on average. This is partly due to the inaccuracy of the provided contact information, where only about 18% of it was correct. For Ts targets, where the information of ambiguous NOE (Nuclear Overhauser Effect) restraints was provided, we formulated the modeling in terms of the two-tier optimization problem, which covers: (1) the assignment of NOE peaks and (2) the three-dimensional (3D) model generation based on the assigned NOEs. Although solving the problem in a direct manner appears to be intractable at first glance, we demonstrate through CASP11 that remarkably accurate protein 3D modeling is possible by brute force optimization of a relevant energy function. For 19 Ts targets of the average size of 224 residues, generated protein models were of about 3.6 Å Cα atom accuracy. Even greater structural improvement was observed when additional Tc contact information was provided. For 20 out of the total 24 Tc targets, we were able to generate protein structures which were better than the best model from the rest of the CASP11 groups in terms of GDT-TS. Proteins 2016; 84(Suppl 1):189-199. © 2015 Wiley Periodicals, Inc.
- Published
- 2016
11. Template-free modeling by LEE and LEER in CASP11
- Author
-
Qianyi Cheng, InSuk Joung, Sung Jong Lee, Jooyoung Lee, Jong Yun Kim, Sun Young Lee, and Keehyoung Joo
- Subjects
0301 basic medicine, Theoretical computer science, business.industry, Computer science, Function (mathematics), Biochemistry, 03 medical and health sciences, 030104 developmental biology, Software, Template, Structural Biology, Cluster analysis, business, CASP, Molecular Biology, Protocol (object-oriented programming), Global optimization, Algorithm, Energy (signal processing)
- Abstract
For the template-free modeling of human targets of CASP11, we utilized two of our modeling protocols, LEE and LEER. The LEE protocol took CASP11-released server models as the input and used some of them as templates for 3D (three-dimensional) modeling. The template selection procedure was based on the clustering of the server models aided by a community detection method of a server-model network. Restraining energy terms generated from the selected templates together with physical and statistical energy terms were used to build 3D models. Side-chains of the 3D models were rebuilt using target-specific consensus side-chain library along with the SCWRL4 rotamer library, which completed the LEE protocol. The first success factor of the LEE protocol was due to efficient server model screening. The average backbone accuracy of selected server models was similar to that of top 30% server models. The second factor was that a proper energy function along with our optimization method guided us, so that we successfully generated better quality models than the input template models. In 10 out of 24 cases, better backbone structures than the best of input template structures were generated. LEE models were further refined by performing restrained molecular dynamics simulations to generate LEER models. CASP11 results indicate that LEE models were better than the average template models in terms of both backbone structures and side-chain orientations. LEER models were of improved physical realism and stereo-chemistry compared to LEE models, and they were comparable to LEE models in the backbone accuracy. Proteins 2016; 84(Suppl 1):118-130. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
12. Template based protein structure modeling by global optimization in CASP11
- Author
-
Jong Young Joung, InSuk Joung, Juyong Lee, Sung Jong Lee, Seungryong Heo, Sun Young Lee, Jooyoung Lee, Qianyi Cheng, Jong Yun Kim, Keehyoung Joo, Balachandran Manavalan, Mikyung Nam, and In-Ho Lee
- Subjects
0301 basic medicine, business.industry, Computer science, Protein structure prediction, 3D modeling, Machine learning, computer.software_genre, Biochemistry, 03 medical and health sciences, 030104 developmental biology, Software, Template, Structural Biology, Homology modeling, Artificial intelligence, business, CASP, Cluster analysis, Molecular Biology, Algorithm, Global optimization, computer
- Abstract
For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
13. Data-assisted protein structure modeling by global optimization in CASP12
- Author
-
Keehyoung Joo, Seungryong Heo, InSuk Joung, Seung Hwan Hong, Sung Jong Lee, and Jooyoung Lee
- Subjects
0301 basic medicine, Models, Molecular, 030102 biochemistry & molecular biology, Protein Conformation, Computational Biology, Proteins, Molecular Dynamics Simulation, Biochemistry, 03 medical and health sciences, 030104 developmental biology, X-Ray Diffraction, Structural Biology, Scattering, Small Angle, Humans, Databases, Protein, Molecular Biology, Algorithms
- Abstract
In CASP12, two types of data-assisted protein structure modeling were tested. Either SAXS experimental data or cross-linking experimental data was provided for a selected number of CASP12 targets, which CASP12 predictors could utilize for better protein structure modeling. We devised two separate energy terms for SAXS data and cross-linking data to drive the model structures into more native-like structures that satisfied the given experimental data as much as possible. In CASP11, we successfully performed protein structure modeling using simulated sparse and ambiguously assigned NOE data and/or correct residue-residue contact information, where the only energy term that folded the protein into its native structure was the term that originated from the given experimental data. However, the two types of experimental data provided in CASP12 were far from sufficient to fold the target protein into its native structure, because SAXS data provides only the overall shape of the molecule and the cross-linking contact information provides only very low-resolution distance information. For this reason, we combined the SAXS or cross-linking energy term with our regular modeling energy function that includes both the template energy term and the de novo energy terms. By optimizing the newly formulated energy function, we obtained protein models that fit better with the provided SAXS data than the X-ray structure of the target. However, the improvement of the model relative to the one modeled without the SAXS data was not significant. Consistent structural improvement was achieved by incorporating cross-linking data into the protein structure modeling.
- Published
- 2017
14. Protein structure modeling and refinement by global optimization in CASP12
- Author
-
Balachandran Manavalan, Jose C. Flores-Canales, Mikyung Nam, Qianyi Cheng, InSuk Joung, Sung Jong Lee, Keehyoung Joo, Sun Young Lee, Jong Yun Kim, In-Ho Lee, Jooyoung Lee, Seungryong Heo, and Seung Hwan Hong
- Subjects
0301 basic medicine, Models, Molecular, Protein Folding, Support Vector Machine, Computer science, Protein Conformation, Molecular Dynamics Simulation, Crystallography, X-Ray, Biochemistry, Force field (chemistry), Accessible surface area, Machine Learning, 03 medical and health sciences, Structural Biology, Sequence Analysis, Protein, Humans, Protein Interaction Domains and Motifs, CASP, Molecular Biology, Global optimization, Models, Statistical, Model selection, Cauchy distribution, Computational Biology, Proteins, Support vector machine, 030104 developmental biology, Bounded function, Algorithm, Algorithms
- Abstract
For protein structure modeling in the CASP12 experiment, we have developed a new protocol based on our previous CASP11 approach. The global optimization method of conformational space annealing (CSA) was applied to 3 stages of modeling: multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain re-modeling. For better template selection and model selection, we updated our model quality assessment (QA) method with the newly developed SVMQA (support vector machine for quality assessment). For 3D chain building, we updated our energy function by including restraints generated from predicted residue-residue contacts. New energy terms for the predicted secondary structure and predicted solvent accessible surface area were also introduced. For difficult targets, we proposed a new method, LEEab, where the template term played a less significant role than it did in LEE, complemented by increased contributions from other terms such as the predicted contact term. For TBM (template-based modeling) targets, LEE performed better than LEEab, but for FM targets, LEEab was better. For model refinement, we modified our CASP11 molecular dynamics (MD) based protocol by using explicit solvents and tuning down restraint weights. Refinement results from MD simulations that used a new augmented statistical energy term in the force field were quite promising. Finally, when using inaccurate information (such as the predicted contacts), it was important to use the Lorentzian function for which the maximal penalty arising from wrong information is always bounded.
- Published
- 2017
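The closing remark of the abstract above, that a Lorentzian restraint keeps the penalty from wrong information bounded, can be illustrated with a small comparison against a harmonic restraint. The functional form, the parameter names (`sigma`, `weight`, `k`) and the example distances below are assumptions made for illustration; the abstract does not give the exact expression used in the CASP protocols.

```python
def harmonic_penalty(d, d0, k=1.0):
    """Harmonic restraint: grows without bound as d moves away from d0."""
    return k * (d - d0) ** 2


def lorentzian_penalty(d, d0, sigma=1.0, weight=1.0):
    """Lorentzian-shaped restraint (assumed form): saturates at `weight`."""
    dev2 = (d - d0) ** 2
    return weight * dev2 / (dev2 + sigma ** 2)


if __name__ == "__main__":
    # A predicted contact at 8 A that is actually wrong (true distance 40 A):
    # the harmonic term dominates the energy, the Lorentzian term does not.
    for d in (8.0, 10.0, 40.0):
        print(d, harmonic_penalty(d, 8.0), lorentzian_penalty(d, 8.0))
```

With this form a single incorrect restraint can contribute at most `weight` to the total energy, so correct restraints and other energy terms can still outvote it.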
15. Finding multiple reaction pathways via global optimization of action
- Author
-
Jooyoung Lee, Bernard R. Brooks, InSuk Joung, Juyong Lee, and In-Ho Lee
- Subjects
Protein Folding, Computational complexity theory, Science, Biophysics, Molecular Conformation, General Physics and Astronomy, Small systems, Molecular Dynamics Simulation, 01 natural sciences, General Biochemistry, Genetics and Molecular Biology, Article, Molecular dynamics, 0103 physical sciences, 010306 general physics, Langevin dynamics, Global optimization, Multidisciplinary, Alanine, 010304 chemical physics, Computational Biology, General Chemistry, Transition time, Folding (DSP implementation), Action (physics), Models, Chemical, Protein folding, Biological system, Peptides, Algorithms, Metabolic Networks and Pathways
- Abstract
Global searching for reaction pathways is a long-standing challenge in computational chemistry and biology. Most existing approaches perform only local searches due to computational complexity. Here we present a computational approach, Action-CSA, to find multiple diverse reaction pathways connecting fixed initial and final states through global optimization of the Onsager–Machlup action using the conformational space annealing (CSA) method. Action-CSA successfully overcomes large energy barriers via crossovers and mutations of pathways and finds all possible pathways of small systems without initial guesses on pathways. The rank order and the transition time distribution of multiple pathways are in good agreement with those of long Langevin dynamics simulations. The lowest action folding pathway of FSD-1 is consistent with recent experiments. The results show that Action-CSA is an efficient and robust computational approach to study the multiple pathways of complex reactions and large-scale conformational changes.
Identifying pathways and transition states is critical to understanding chemical and biological reactions. Here, the authors introduce a capable computational approach using conformational space annealing to find multiple reaction pathways via global optimization of the Onsager-Machlup action.
- Published
- 2016
16. A general method for the derivation of the functional forms of the effective energy terms in coarse-grained energy functions of polymers. III. Determination of scale-consistent backbone-local and correlation potentials in the UNRES force field and force-field calibration and validation
- Author
-
Adam Liwo, Stanisław Ołdziej, Adam K. Sieradzan, Agnieszka G. Lipska, Wioletta Żmudzińska, Anna Hałabis, Cezary Czaplewski, and InSuk Joung
- Subjects
Physics, chemistry.chemical_classification, 010304 chemical physics, biology, Molecular biophysics, Ab initio, General Physics and Astronomy, Thermodynamics, Polymer, 010402 general chemistry, 01 natural sciences, Force field (chemistry), 0104 chemical sciences, Molecular geometry, Protein structure, chemistry, Ab initio quantum chemistry methods, 0103 physical sciences, biology.protein, Physical and Theoretical Chemistry, Villin
- Abstract
The general theory of the construction of scale-consistent energy terms in the coarse-grained force fields presented in Paper I of this series has been applied to the revision of the UNRES force field for physics-based simulations of proteins. The potentials of mean force corresponding to backbone-local and backbone-correlation energy terms were calculated from the ab initio energy surfaces of terminally blocked glycine, alanine, and proline, and the respective analytical expressions, derived by using the scale-consistent formalism, were fitted to them. The parameters of all these potentials depend on single-residue types, thus reducing their number and preventing over-fitting. The UNRES force field with the revised backbone-local and backbone-correlation terms was calibrated with a set of four small proteins with basic folds: tryptophan cage variant (TRP1; α), Full Sequence Design (FSD; α + β), villin headpiece (villin; α), and a truncated FBP-28 WW-domain variant (2MWD; β) (the NEWCT-4P force field) and, subsequently, with an enhanced set of 9 proteins composed of TRP1, FSD, villin, 1BDC (α), 2I18 (α), 1QHK (α + β), 2N9L (α + β), 1E0L (β), and 2LX7 (β) (the NEWCT-9P force field). The NEWCT-9P force field performed better than NEWCT-4P in a blind-prediction-like test with a set of 26 proteins not used in calibration and outperformed, in a test with 76 proteins, the most advanced OPT-WTFSA-2 version of UNRES with former backbone-local and backbone-correlation terms that contained more energy terms and more optimizable parameters. The NEWCT-9P force field reproduced the bimodal distribution of backbone-virtual-bond angles in the simulated structures, as observed in experimental protein structures.
- Published
- 2019
17. Finding dominant transition pathways via global optimization of action
- Author
-
Juyong Lee, InSuk Joung, Jooyong Lee, Bernard R. Brooks, and In-Ho Lee
- Subjects
0301 basic medicine, Chemical Physics (physics.chem-ph), Computer science, Crossover, Biophysics, FOS: Physical sciences, Biomolecules (q-bio.BM), Transition time, 03 medical and health sciences, Molecular dynamics, 030104 developmental biology, Quantitative Biology - Biomolecules, Computational chemistry, Biological Physics (physics.bio-ph), Physics - Chemical Physics, FOS: Biological sciences, Physics - Biological Physics, Biological system, Global optimization
- Abstract
We present a new computational approach, Action-CSA, to sample multiple reaction pathways with fixed initial and final states through global optimization of the Onsager-Machlup action using the conformational space annealing method. This approach successfully samples not only the most dominant pathway but also many other possible paths without initial guesses on reaction pathways. Pathway space is efficiently sampled by crossover operations of a set of paths and preserving the diversity of sampled pathways. The sampling ability of the approach is assessed by finding pathways for the conformational changes of alanine dipeptide and hexane. The benchmarks demonstrate that the rank order and the transition time distribution of multiple pathways identified by the new approach are in good agreement with those of long molecular dynamics simulations. We also show that the lowest action folding pathway of the mini-protein FSD-1 identified by the new approach is consistent with previous molecular dynamics simulations and experiments.
Comment: 17 pages, 3 figures, and 2 tables
- Published
- 2016
18. Folding Mechanisms of Small Proteins GB1 and LB1
- Author
-
InSuk Joung, Qianyi Cheng, Jooyoung Lee, Keehyoung Joo, and Kunihiro Kuwajima
- Subjects
Similarity (geometry), biology, Chemistry, Biophysics, Structure (category theory), Function (mathematics), Computational biology, Folding (chemistry), Protein L, Crystallography, Simple (abstract algebra), Path (graph theory), biology.protein, Protein G
- Abstract
The B1 domains of protein G (GB1) and protein L (LB1) are two small proteins that bind to the antibody immunoglobulin G (IgG). GB1 and LB1 are similar in size (about 60 residues) and have a similar overall structure (β-hairpin, α-helix, β-hairpin). However, their sequences are very different, sharing only 15% identity in a structure-based alignment. Their folding mechanisms therefore show interesting similarities and differences. Experimental evidence indicates that LB1 folds in a two-state manner, while GB1 folds in a more complex way: an early-stage intermediate may exist on the folding path. The folding mechanisms are still under extensive experimental and computational study. Structure-based modeling is one of the less costly computational methods. It uses a simply formulated potential energy function that sums over various geometrical restraints from one or more target structures. Here, we used a new all-atom structure-based method to investigate the folding mechanisms of GB1 and LB1. In this approach, folded structures of the two proteins were used to construct the restraints, and they are stabilized by a Lorentzian attractive term instead of the conventional harmonic term [3]. Our model is able to distinguish two-state from non-two-state proteins and gives more insight into their folding pathways.
1. Scalley, M. L., Yi, Q., Gu, H., McCormack, A., Yates, 3rd, J. R., and Baker, D. Biochemistry 36: 3373-82, 1997.
2. McCallister, E., Alm, E., and Baker, D. Nat. Struct. Biol. 7: 669-673, 2000.
3. Lee, J., Joo, K., Brooks, B., and Lee, J. J. Chem. Theory Comput. 11: 3211-3224, 2015.
- Published
- 2017
19. Determination of Alkali and Halide Monovalent Ion Parameters for Use in Explicitly Solvated Biomolecular Simulations
- Author
-
InSuk Joung and Thomas E. Cheatham
- Subjects
Models, Molecular ,Ionic bonding ,Halide ,Biomolecular structure ,Alkalies ,010402 general chemistry ,Crystallography, X-Ray ,01 natural sciences ,Article ,Ion ,Halogens ,Computational chemistry ,0103 physical sciences ,Materials Chemistry ,Molecule ,Physical and Theoretical Chemistry ,chemistry.chemical_classification ,Quantitative Biology::Biomolecules ,010304 chemical physics ,Chemistry ,Biomolecule ,Solvation ,Water ,0104 chemical sciences ,Surfaces, Coatings and Films ,Folding (chemistry) ,Solvents ,Quantum Theory ,Thermodynamics - Abstract
Alkali (Li(+), Na(+), K(+), Rb(+), and Cs(+)) and halide (F(-), Cl(-), Br(-), and I(-)) ions play an important role in many biological phenomena, roles that range from stabilization of biomolecular structure, to influence on biomolecular dynamics, to key physiological influence on homeostasis and signaling. To properly model ionic interaction and stability in atomistic simulations of biomolecular structure, dynamics, folding, catalysis, and function, an accurate model or representation of the monovalent ions is critically necessary. A good model needs to simultaneously reproduce many properties of ions, including their structure, dynamics, solvation, and moreover both the interactions of these ions with each other in the crystal and in solution and the interactions of ions with other molecules. At present, the best force fields for biomolecules employ a simple additive, nonpolarizable, and pairwise potential for atomic interaction. In this work, we describe our efforts to build better models of the monovalent ions within the pairwise Coulombic and 6-12 Lennard-Jones framework, where the models are tuned to balance crystal and solution properties in Ewald simulations with specific choices of well-known water models. Although it has been clearly demonstrated that truly accurate treatments of ions will require inclusion of nonadditivity and polarizability (particularly with the anions) and ultimately even a quantum mechanical treatment, our goal was to simply push the limits of the additive treatments to see if a balanced model could be created. The applied methodology is general and can be extended to other ions and to polarizable force-field models. Our starting point centered on observations from long simulations of biomolecules in salt solution with the AMBER force fields where salt crystals formed well below their solubility limit. 
The likely cause of the artifact in the AMBER parameters relates to the naive mixing of the Smith and Dang chloride parameters with AMBER-adapted Aqvist cation parameters. To provide a more appropriate balance, we reoptimized the parameters of the Lennard-Jones potential for the ions and specific choices of water models. To validate and optimize the parameters, we calculated hydration free energies of the solvated ions and also lattice energies (LE) and lattice constants (LC) of alkali halide salt crystals. This is the first effort that systematically scans across the Lennard-Jones space (well depth and radius) while balancing ion properties like LE and LC across all pair combinations of the alkali ions and halide ions. The optimization across the entire monovalent series avoids systematic deviations. The ion parameters developed, optimized, and characterized were targeted for use with some of the most commonly used rigid and nonpolarizable water models, specifically TIP3P, TIP4P EW, and SPC/E. In addition to well reproducing the solution and crystal properties, the new ion parameters well reproduce binding energies of the ions to water and the radii of the first hydration shells.
- Published
- 2008
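As a rough illustration of the "pairwise Coulombic and 6-12 Lennard-Jones framework" named in the abstract of record 19, the sketch below evaluates the energy of one ion pair. It is not the authors' code: the function names, the sample parameter values, and the approximate Coulomb constant (about 332.06 kcal·Å/(mol·e²)) are assumptions chosen for illustration, and the Lennard-Jones term is written in the well-depth/minimum-distance form, one of several equivalent conventions.

```python
# Minimal sketch of an additive, nonpolarizable pair potential:
# a 12-6 Lennard-Jones term plus a Coulomb term.
# All names and numeric values here are illustrative assumptions.

def lj_12_6(r, epsilon, r_min):
    """12-6 Lennard-Jones energy.

    epsilon: well depth (kcal/mol); r_min: separation of the minimum (angstrom).
    The energy equals -epsilon at r == r_min.
    """
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


def coulomb(r, q_i, q_j, k=332.06):
    """Coulomb energy for charges in units of e and r in angstrom.

    k is an approximate conversion constant in kcal*angstrom/(mol*e^2).
    """
    return k * q_i * q_j / r


def pair_energy(r, epsilon, r_min, q_i, q_j):
    """Total pairwise energy: Lennard-Jones plus Coulomb."""
    return lj_12_6(r, epsilon, r_min) + coulomb(r, q_i, q_j)


if __name__ == "__main__":
    # Hypothetical cation-anion pair; the parameters are placeholders,
    # not values from the paper.
    for r in (2.5, 3.0, 4.0):
        print(r, pair_energy(r, epsilon=0.1, r_min=3.0, q_i=1.0, q_j=-1.0))
```

Scanning `epsilon` and `r_min` over a grid while comparing computed properties against targets is the kind of search over "Lennard-Jones space (well depth and radius)" that the abstract describes, although the actual target properties (hydration free energies, lattice energies, lattice constants) require full simulations rather than a single pair evaluation.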
20. Structure-based protein folding type classification and folding rate prediction
- Author
-
Jooyoung Lee, Kunihiro Kuwajima, InSuk Joung, and Balachandran Manavalan
- Subjects
Support vector machine ,symbols.namesake ,Protein design ,symbols ,Protein folding ,Folding (DSP implementation) ,Protein structure prediction ,Bioinformatics ,Biological system ,Contact order ,Statistical potential ,Pearson product-moment correlation coefficient ,Mathematics - Abstract
Protein folding rate is one of the important properties of a protein. Protein folding rate prediction is useful for understanding protein folding process and guiding protein design. In this study, we developed a support vector machine (SVM) based method to predict protein folding kinetic types (two-state or non-two-state) and the real-value folding rate using the features calculated from the three-dimensional structure such as contact order, various properties from the non-local contact clusters, secondary structural information and sequence length. We systematically studied the contributions of individual features to folding rate prediction. Based on the highest contributions of individual features, we trained our machine using leave one out cross-validation and tested on a testing dataset. The Pearson correlation coefficient, mean absolute difference and root mean square error between the predicted and experimental folding rates (base-10 logarithmic scale) are 0.814, 0.752 and 0.910 for two-state proteins, and 0.860, 0.687 and 0.876 for non-two-state proteins. Moreover, our method predicts whether a protein of known atomic structure folds according to two-state or non-two-state kinetics and correctly classifies 80% of the folding mechanism on a testing dataset. Finally, we evaluated the performance of our method along with the other eight existing protein folding rate prediction tools on non-overlapping benchmarking dataset. The prediction performance will also be reported and discussed.
- Published
- 2015
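The three figures of merit quoted in the abstract of record 20 (Pearson correlation coefficient, mean absolute difference, and root mean square error between predicted and experimental log10 folding rates) can be computed as in the sketch below. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math

# Minimal sketch of the three agreement metrics named in the abstract.
# Inputs are equal-length sequences of predicted and experimental values.

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_difference(x, y):
    """Average of |x_i - y_i|."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def rmse(x, y):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Made-up log10 folding rates, for illustration only.
    predicted = [1.2, 3.4, 0.5, 2.8]
    experimental = [1.0, 3.9, 0.1, 2.5]
    print(pearson(predicted, experimental))
    print(mean_absolute_difference(predicted, experimental))
    print(rmse(predicted, experimental))
```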
21. Contact-assisted protein structure modeling by global optimization in CASP11
- Author
-
Keehyoung Joo, InSuk Joung, Qianyi Cheng, Sung Jong Lee, and Jooyoung Lee
- Subjects
Models, Molecular ,Protein Conformation, alpha-Helical ,Internet ,Protein Folding ,Models, Statistical ,Amino Acid Motifs ,Computational Biology ,Proteins ,Thermodynamics ,Computer Simulation ,Protein Conformation, beta-Strand ,Protein Interaction Domains and Motifs ,Databases, Protein ,Nuclear Magnetic Resonance, Biomolecular ,Algorithms ,Software - Abstract
We have applied the conformational space annealing method to the contact-assisted protein structure modeling in CASP11. For Tp targets, where predicted residue-residue contact information was provided, the contact energy term in the form of the Lorentzian function was implemented together with the physical energy terms used in our template-free modeling of proteins. Although we observed some structural improvement of Tp models over the models predicted without the Tp information, the improvement was not substantial on average. This is partly due to the inaccuracy of the provided contact information, where only about 18% of it was correct. For Ts targets, where the information of ambiguous NOE (Nuclear Overhauser Effect) restraints was provided, we formulated the modeling in terms of the two-tier optimization problem, which covers: (1) the assignment of NOE peaks and (2) the three-dimensional (3D) model generation based on the assigned NOEs. Although solving the problem in a direct manner appears to be intractable at first glance, we demonstrate through CASP11 that remarkably accurate protein 3D modeling is possible by brute force optimization of a relevant energy function. For 19 Ts targets of the average size of 224 residues, generated protein models were of about 3.6 Å Cα atom accuracy. Even greater structural improvement was observed when additional Tc contact information was provided. For 20 out of the total 24 Tc targets, we were able to generate protein structures which were better than the best model from the rest of the CASP11 groups in terms of GDT-TS. Proteins 2016; 84(Suppl 1):189-199. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
22. Template-free modeling by LEE and LEER in CASP11
- Author
-
InSuk Joung, Sun Young Lee, Qianyi Cheng, Jong Yun Kim, Keehyoung Joo, Sung Jong Lee, and Jooyoung Lee
- Subjects
Internet ,Protein Folding ,Models, Statistical ,Bacteria ,Amino Acid Motifs ,Computational Biology ,Proteins ,Stereoisomerism ,Molecular Dynamics Simulation ,Viruses ,Humans ,Thermodynamics ,Protein Interaction Domains and Motifs ,Databases, Protein ,Algorithms ,Software - Abstract
For the template-free modeling of human targets of CASP11, we utilized two of our modeling protocols, LEE and LEER. The LEE protocol took CASP11-released server models as the input and used some of them as templates for 3D (three-dimensional) modeling. The template selection procedure was based on the clustering of the server models aided by a community detection method of a server-model network. Restraining energy terms generated from the selected templates together with physical and statistical energy terms were used to build 3D models. Side-chains of the 3D models were rebuilt using target-specific consensus side-chain library along with the SCWRL4 rotamer library, which completed the LEE protocol. The first success factor of the LEE protocol was due to efficient server model screening. The average backbone accuracy of selected server models was similar to that of top 30% server models. The second factor was that a proper energy function along with our optimization method guided us, so that we successfully generated better quality models than the input template models. In 10 out of 24 cases, better backbone structures than the best of input template structures were generated. LEE models were further refined by performing restrained molecular dynamics simulations to generate LEER models. CASP11 results indicate that LEE models were better than the average template models in terms of both backbone structures and side-chain orientations. LEER models were of improved physical realism and stereo-chemistry compared to LEE models, and they were comparable to LEE models in the backbone accuracy. Proteins 2016; 84(Suppl 1):118-130. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
23. Template based protein structure modeling by global optimization in CASP11
- Author
-
Keehyoung Joo, InSuk Joung, Sun Young Lee, Jong Yun Kim, Qianyi Cheng, Balachandran Manavalan, Jong Young Joung, Seungryong Heo, Juyong Lee, Mikyung Nam, In-Ho Lee, Sung Jong Lee, and Jooyoung Lee
- Subjects
Models, Molecular ,Internet ,Protein Folding ,Models, Statistical ,Computational Biology ,Proteins ,Protein Structure, Secondary ,Structural Homology, Protein ,Humans ,Thermodynamics ,Computer Simulation ,Protein Interaction Domains and Motifs ,Amino Acid Sequence ,Databases, Protein ,Sequence Alignment ,Algorithms ,Software - Abstract
For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.
- Published
- 2015
24. Protein structure determination by conformational space annealing using NMR geometric restraints
- Author
-
Jooyoung Lee, Jinwoo Lee, Weontae Lee, Sung Jong Lee, Bernard R. Brooks, Jinhyuk Lee, Keehyoung Joo, and InSuk Joung
- Subjects
Models, Molecular ,0303 health sciences ,Chemistry ,Annealing (metallurgy) ,Protein Conformation ,Protein Data Bank (RCSB PDB) ,Biophysics ,Proteins ,Biochemistry ,Crystallography ,03 medical and health sciences ,Protein structure ,0302 clinical medicine ,Structural Biology ,Molecule ,Databases, Protein ,Molecular Biology ,Global optimization ,Nuclear Magnetic Resonance, Biomolecular ,030217 neurology & neurosurgery ,Ramachandran plot ,030304 developmental biology - Abstract
We have carried out numerical experiments to investigate the applicability of the global optimization method of conformational space annealing (CSA) to the enhanced NMR protein structure determination over existing PDB structures. The NMR protein structure determination is driven by the optimization of collective multiple restraints arising from experimental data and the basic stereochemical properties of a protein-like molecule. By rigorous and straightforward application of CSA to the identical NMR experimental data used to generate existing PDB structures, we redetermined 56 recent PDB protein structures starting from fully randomized structures. The quality of CSA-generated structures and existing PDB structures were assessed by multiobjective functions in terms of their consistencies with experimental data and the requirements of protein-like stereochemistry. In 54 out of 56 cases, CSA-generated structures were better than existing PDB structures in the Pareto-dominant manner, while in the remaining two cases, it was a tie with mixed results. As a whole, all structural features tested improved in a statistically meaningful manner. The most improved feature was the Ramachandran favored portion of backbone torsion angles with about 8.6% improvement from 88.9% to 97.5% (P-value
- Published
- 2015
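The abstract of record 24 compares structures "in the Pareto-dominant manner" and reports a "tie with mixed results" in two cases. A minimal sketch of that comparison rule follows. It assumes every objective is expressed so that lower is better, and the score tuples are invented for illustration; the paper's actual objectives are not reproduced here.

```python
# Minimal sketch of Pareto dominance between two score vectors.
# Assumption: every objective is "lower is better".

def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def compare(a, b):
    """Return 'a', 'b', or 'tie' (neither vector dominates the other)."""
    if dominates(a, b):
        return "a"
    if dominates(b, a):
        return "b"
    return "tie"


if __name__ == "__main__":
    # Hypothetical (restraint violation, stereochemistry penalty) pairs.
    print(compare((0.8, 1.5), (1.0, 2.0)))  # first is better on both objectives
    print(compare((0.8, 2.5), (1.0, 2.0)))  # mixed results: neither dominates
```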
25. Sigma-RF: prediction of the variability of spatial restraints in template-based modeling by random forest
- Author
-
Juyong Lee, Keehyoung Joo, Bernard R. Brooks, InSuk Joung, Ki-Ho Lee, and Jooyoung Lee
- Subjects
Models, Molecular ,Template-based modeling ,Homology modeling ,Random forest ,Machine learning ,Protein structure ,Protein structure prediction ,Protein sequence ,Bioinformatics ,Statistics ,Computer science ,Gaussian ,Sequence alignment ,Machine learning ,computer.software_genre ,Biochemistry ,symbols.namesake ,Protein sequencing ,Artificial Intelligence ,Structural Biology ,Molecular Biology ,Models, Statistical ,business.industry ,Methodology Article ,Applied Mathematics ,Sigma ,MODELLER ,Computer Science Applications ,Structural Homology, Protein ,Benchmark (computing) ,symbols ,Artificial intelligence ,DNA microarray ,business ,Protein sequence ,Sequence Alignment ,computer ,Algorithm ,Algorithms - Abstract
Background In template-based modeling when using a single template, inter-atomic distances of an unknown protein structure are assumed to be distributed by Gaussian probability density functions, whose center peaks are located at the distances between corresponding atoms in the template structure. The width of the Gaussian distribution, the variability of a spatial restraint, is closely related to the reliability of the restraint information extracted from a template, and it should be accurately estimated for successful template-based protein structure modeling. Results To predict the variability of the spatial restraints in template-based modeling, we have devised a prediction model, Sigma-RF, by using the random forest (RF) algorithm. The benchmark results on 22 CASP9 targets show that the variability values from Sigma-RF are of higher correlations with the true distance deviation than those from Modeller. We assessed the effect of new sigma values by performing the single-domain homology modeling of 22 CASP9 targets and 24 CASP10 targets. For most of the targets tested, we could obtain more accurate 3D models from the identical alignments by using the Sigma-RF results than by using Modeller ones. Conclusions We find that the average alignment quality of residues located between and at two aligned residues, quasi-local information, is the most contributing factor, by investigating the importance of input features used in the RF machine learning. This average alignment quality is shown to be more important than the previously identified quantity of a local information: the product of alignment qualities at two aligned residues. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0526-z) contains supplementary material, which is available to authorized users.
- Published
- 2015
26. The Role of Mechanical Stresses in Angiogenesis
- Author
-
Matthew N. Iwamoto, James B. Hoying, InSuk Joung, Yan-Ting Shiu, Jeffrey A. Weiss, and Cole T. Quam
- Subjects
Angiogenic Process ,Angiogenesis ,Models, Cardiovascular ,Biomedical Engineering ,Endothelial Cells ,Neovascularization, Physiologic ,Blood Pressure ,Cell Differentiation ,Biology ,Mechanotransduction, Cellular ,Extracellular Matrix ,Extracellular matrix ,Intracellular signaling pathways ,Pathological Angiogenesis ,Immunology ,Animals ,Blood Vessels ,Humans ,Stress, Mechanical ,Mechanotransduction ,Neuroscience ,Blood Flow Velocity - Abstract
Angiogenesis is the formation of new capillary blood vessels from preexisting vessels. It is involved in many normal and diseased conditions, as well as in the application of tissue-engineered products. There has been extensive effort made to develop strategies for controlling pathological angiogenesis and for promoting vascularization in biomedical engineering applications. Central to advancing these strategies is a mechanistic understanding of the angiogenic process. Angiogenesis is tightly regulated by local tissue environmental factors, including soluble molecules, extracellular matrices, cell-cell interactions, and diverse mechanical forces. Great advances have been made in identifying the biochemical factors and intracellular signaling pathways that mediate the control of angiogenesis. This review focuses on work that explores the biophysical aspect of angiogenesis regulation. Specifically, we discuss the role of cell-generated forces, counterforces from the extracellular matrix, and mechanical forces associated with blood flow and extravascular tissue activity in the regulation of angiogenesis. Because angiogenesis occurs in a mechanically dynamic environment, future investigations should aim at understanding how cells integrate chemical and mechanical signals so that a rational approach to controlling angiogenesis will become possible. In this regard, computational models that incorporate multiple epigenetic factors to predict capillary patterning will be useful.
- Published
- 2005
27. Integral Equation Theory of Biomolecules and Electrolytes
- Author
-
Tyler Luchko, David A. Case, and InSuk Joung
- Subjects
chemistry.chemical_classification ,Solvent ,Monovalent ions ,chemistry ,Chemical physics ,Biomolecule ,Inorganic chemistry ,Electrolyte ,Integral equation ,Ion - Abstract
The so-called three-dimensional version (3D-RISM) can be used to describe the interactions of solvent components (here we treat water and ions) with a chemical or biomolecular solute of arbitrary size and shape. Here we give an overview of the current status of such models, describing some aspects of “pure” electrolytes (water plus simple ions) and of ionophores, proteins and nucleic acids in the presence of water and salts. Here we focus primarily on interactions with water and dissolved salts; as a practical matter, the discussion is mostly limited to monovalent ions, since studies of divalent ions present many difficult problems that have not yet been addressed. This is not a comprehensive review, but covers a few recent examples that illustrate current issues.
- Published
- 2012
28. Cyclic strain affects the orientation of endothelial tubulogenesis in a frequency‐dependent manner
- Author
-
Matthew N. Iwamoto, InSuk Joung, Yan-Ting Shiu, Jake Jensen, and Vasiliy Chernyshev
- Subjects
Cyclic strain ,Materials science ,Dependent manner ,Genetics ,Biophysics ,Orientation (graph theory) ,Molecular Biology ,Biochemistry ,Biotechnology - Published
- 2006
29. Cyclic strain modulates tubulogenesis of endothelial cells in a 3D tissue culture model
- Author
-
InSuk Joung, Yan-Ting Shiu, Cole T. Quam, and Matthew N. Iwamoto
- Subjects
Cyclic strain ,Periodicity ,Angiogenesis ,Cell Culture Techniques ,Neovascularization, Physiologic ,Aorta, Thoracic ,Biology ,Biochemistry ,Mechanotransduction, Cellular ,Models, Biological ,Cell Line ,Tissue culture ,Vasculogenesis ,Imaging, Three-Dimensional ,Cell Movement ,Physical Stimulation ,von Willebrand Factor ,Animals ,Mechanotransduction ,Hemodynamic forces ,Cell Biology ,Blood flow ,Immunohistochemistry ,Actins ,Elasticity ,Cell biology ,Capillaries ,In vitro system ,Cattle ,Collagen ,Endothelium, Vascular ,Stress, Mechanical ,Cardiology and Cardiovascular Medicine ,Gels - Abstract
Angiogenesis is the formation of new blood vessels from preexisting capillaries or venules. It occurs in a mechanically dynamic environment due to blood flow, but the role of hemodynamic forces in angiogenesis remains poorly understood. We have developed a unique in vitro system for the investigation of angiogenesis under cyclic strain. In this system, tubulogenesis of vascular endothelial cells in 3D collagen gels occurs under well-defined cyclic strain, which mimics blood-pressure-induced stretch. Using this system, we demonstrate that cyclic strain results in alignment of endothelial-cord-like structures perpendicular to the principal axis of stretch. Such preferential orientation was the most evident in deep and long cord-like structures. This in vitro system, along with the novel findings of strain-modulated endothelial tube morphology, enables the formation of an experimental basis for understanding the role of cyclic strain in the regulation of angiogenesis.
- Published
- 2005
30. Simple electrolyte solutions: Comparison of DRISM and molecular dynamics results for alkali halide solutions
- Author
-
David A. Case, Tyler Luchko, and InSuk Joung
- Subjects
Models, Molecular ,Activity coefficient ,Aqueous solution ,Chemistry ,Solvation ,General Physics and Astronomy ,Thermodynamics ,Electrolyte ,Alkalies ,Molecular Dynamics Simulation ,Ion ,Enthalpy change of solution ,Solutions ,Electrolytes ,Molecular dynamics ,Theoretical Methods and Algorithms ,Physical chemistry ,Salts ,Physical and Theoretical Chemistry ,Series expansion - Abstract
Using the dielectrically consistent reference interaction site model (DRISM) of molecular solvation, we have calculated structural and thermodynamic information of alkali-halide salts in aqueous solution, as a function of salt concentration. The impact of varying the closure relation used with DRISM is investigated using the partial series expansion of order-n (PSE-n) family of closures, which includes the commonly used hypernetted-chain equation (HNC) and Kovalenko-Hirata closures. Results are compared to explicit molecular dynamics (MD) simulations, using the same force fields, and to experiment. The mean activity coefficients of ions predicted by DRISM agree well with experimental values at concentrations below 0.5 m, especially when using the HNC closure. As individual ion activities (and the corresponding solvation free energies) are not known from experiment, only DRISM and MD results are directly compared and found to have reasonably good agreement. The activity of water directly estimated from DRISM is nearly consistent with values derived from the DRISM ion activities and the Gibbs-Duhem equation, but the changes in the computed pressure as a function of salt concentration dominate these comparisons. Good agreement with experiment is obtained if these pressure changes are ignored. Radial distribution functions of NaCl solution at three concentrations were compared between DRISM and MD simulations. DRISM shows comparable water distribution around the cation, but water structures around the anion deviate from the MD results; this may also be related to the high pressure of the system. Despite some problems, DRISM-PSE-n is an effective tool for investigating thermodynamic properties of simple electrolytes.
- Published
- 2013
31. Sigma-RF: prediction of the variability of spatial restraints in template-based modeling by random forest.
- Author
-
Juyong Lee, Kiho Lee, InSuk Joung, Keehyoung Joo, Bernard R. Brooks, and Jooyoung Lee
- Subjects
SEQUENCE alignment ,RANDOM forest algorithms ,MACHINE learning ,PROTEIN structure ,AMINO acid sequence ,BIOINFORMATICS - Abstract
Background: In template-based modeling when using a single template, inter-atomic distances of an unknown protein structure are assumed to be distributed by Gaussian probability density functions, whose center peaks are located at the distances between corresponding atoms in the template structure. The width of the Gaussian distribution, the variability of a spatial restraint, is closely related to the reliability of the restraint information extracted from a template, and it should be accurately estimated for successful template-based protein structure modeling. Results: To predict the variability of the spatial restraints in template-based modeling, we have devised a prediction model, Sigma-RF, by using the random forest (RF) algorithm. The benchmark results on 22 CASP9 targets show that the variability values from Sigma-RF are of higher correlations with the true distance deviation than those from Modeller. We assessed the effect of new sigma values by performing the single-domain homology modeling of 22 CASP9 targets and 24 CASP10 targets. For most of the targets tested, we could obtain more accurate 3D models from the identical alignments by using the Sigma-RF results than by using Modeller ones. Conclusions: We find that the average alignment quality of residues located between and at two aligned residues, quasi-local information, is the most contributing factor, by investigating the importance of input features used in the RF machine learning. This average alignment quality is shown to be more important than the previously identified quantity of a local information: the product of alignment qualities at two aligned residues. Keywords: Template-based modeling, Homology modeling, Random forest, Machine learning, Protein structure, Protein structure prediction, Protein sequence, Bioinformatics, Statistics [ABSTRACT FROM AUTHOR]
- Published
- 2015