1. LoCo: a novel main chain scoring function for protein structure prediction based on local coordinates
- Author
- Ram Samudrala and Stewart E. Moughon
- Subjects
Protein Conformation ,Computer science ,Knowledge Bases ,Coordinate system ,Protein design ,lcsh:Computer applications to medicine. Medical informatics ,010402 general chemistry ,01 natural sciences ,Biochemistry ,03 medical and health sciences ,Protein structure ,Structural Biology ,Position (vector) ,Local coordinates ,Amino Acids ,lcsh:QH301-705.5 ,Molecular Biology ,030304 developmental biology ,chemistry.chemical_classification ,0303 health sciences ,business.industry ,Applied Mathematics ,Rank (computer programming) ,Proteins ,Function (mathematics) ,Protein structure prediction ,0104 chemical sciences ,Computer Science Applications ,Amino acid ,Models, Chemical ,lcsh:Biology (General) ,chemistry ,Proteins metabolism ,lcsh:R858-859.7 ,Artificial intelligence ,DNA microarray ,business ,Decoy ,Algorithm ,Research Article
- Abstract
Background: Successful protein structure prediction requires accurate low-resolution scoring functions so that protein main chain conformations that are close to the native can be identified. Once that is accomplished, a more detailed and time-consuming treatment to produce all-atom models can be undertaken. The earliest low-resolution scoring used simple distance-based "contact potentials," but more recently, the relative orientations of interacting amino acids have been taken into account to improve performance.
Results: We developed a new knowledge-based scoring function, LoCo, that locates the interaction partners of each individual residue within a local coordinate system based only on the position of its main chain N, Cα and C atoms. LoCo was trained on a large set of experimentally determined structures and optimized using standard sets of modeled structures, or "decoys." No structure used to train or optimize the function was included among those used to test it. When tested against 29 other published main chain functions on a group of 77 commonly used decoy sets, our function outperformed all others in Cα RMSD rank of the best-scoring decoy, with statistically significant p-values < 0.05 for 26 out of the 29 other functions considered. LoCo is fast, requiring on average less than 6 microseconds per residue for interaction and scoring on commonly-used computer hardware.
Conclusions: Our function demonstrates an unmatched combination of accuracy, speed, and simplicity and shows excellent promise for protein structure prediction. Broader applications may include protein-protein interactions and protein design.
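The abstract describes locating each residue's interaction partners in a local coordinate system defined only by its main chain N, Cα and C atoms. The sketch below illustrates that general idea and is not the authors' implementation. The axis convention (x along Cα→C, z normal to the N-Cα-C plane), the function names `local_frame` and `to_local`, and the example coordinates are all assumptions introduced here.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    norm = math.sqrt(dot(v, v))
    return (v[0] / norm, v[1] / norm, v[2] / norm)

def local_frame(n, ca, c):
    """Build a right-handed orthonormal frame centred on CA from N, CA and C.

    Assumed convention (not taken from the paper): x points along CA->C,
    z is normal to the N-CA-C plane, and y completes the right-handed set.
    """
    x = unit(sub(c, ca))
    z = unit(cross(x, sub(n, ca)))
    y = cross(z, x)  # already unit length because z and x are orthonormal
    return ca, (x, y, z)

def to_local(point, frame):
    """Express a point (e.g. a partner residue's CA) in the residue's local frame."""
    origin, (x, y, z) = frame
    d = sub(point, origin)
    return (dot(d, x), dot(d, y), dot(d, z))

# Example with made-up coordinates (not from any real structure).
frame = local_frame(n=(1.0, 2.0, 1.0), ca=(1.0, 1.0, 1.0), c=(2.0, 1.0, 1.0))
partner_local = to_local((3.0, 4.0, 5.0), frame)
```

Because the frame is built from the residue's own atoms, the local coordinates of a partner do not change when the whole structure is rotated or translated. That invariance is what lets a knowledge-based function collect orientation-dependent statistics across many structures.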
- Published
- 2011