Reconstruction of Protein Backbones from the BriX Collection of Canonical Protein Fragments
- Source :
- PLoS Computational Biology, Vol 4, Iss 5, p e1000083 (2008), Vrije Universiteit Brussel
- Publication Year :
- 2008
- Publisher :
- Public Library of Science (PLoS), 2008.
Abstract
- As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant amount of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures.
Author Summary
- Large-scale DNA sequencing efforts produce large amounts of protein sequence data. However, in order to understand the function of a protein, its tertiary three-dimensional structure is required. Despite worldwide efforts in structural biology, experimental protein structures are determined at a significantly slower pace. As a result, computational methods for protein structure prediction receive significant attention. A large part of the structure prediction problem lies in the enormous size of the problem: proteins seem to occur in an infinite variety of shapes. Here, we propose that this huge complexity may be overcome by identifying recurrent protein fragments, which are frequently reused as building blocks to construct proteins that were hitherto thought to be unrelated. The BriX database is the outcome of identifying about 2,000 canonical shapes among 1,261 protein structures. We show any given protein can be reconstructed from this library of building blocks at a very high resolution, suggesting that the modelling of protein backbones may be greatly aided by our database.
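The abstract reports coverage and reconstruction accuracy as root mean square distance (RMSD) values in Angstrom. As a rough illustration of what that measure computes, here is a minimal Python sketch. It is not the authors' code: the function name `rmsd` and the toy coordinates are hypothetical, and it assumes the two fragments have equal length and have already been superposed (the optimal superposition step that normally precedes the calculation is omitted).

```python
import math

def rmsd(frag_a, frag_b):
    """RMSD between two equal-length lists of (x, y, z) atom positions.

    Assumption: both fragments are already superposed in the same
    coordinate frame; no rotation or translation is fitted here.
    """
    if len(frag_a) != len(frag_b) or not frag_a:
        raise ValueError("fragments must be non-empty and of equal length")
    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (p[i] - q[i]) ** 2
        for p, q in zip(frag_a, frag_b)
        for i in range(3)
    )
    return math.sqrt(squared / len(frag_a))

# Toy example (hypothetical coordinates): every atom is displaced by 1 Angstrom.
fragment = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(fragment, shifted))
```

With a measure of this kind, a backbone segment would count as covered at the 1 Angstrom threshold mentioned in the abstract when its RMSD to some library fragment is at most 1.0.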
- Subjects :
- Models, Molecular
Protein Conformation
Molecular Sequence Data
Context (language use)
Biology
Molecular Biology/Bioinformatics
Cellular and Molecular Neuroscience
Protein structure
fragment library
Sequence Analysis, Protein
Genetics
Side chain
Computer Simulation
Amino Acid Sequence
Databases, Protein
Structural motif
lcsh:QH301-705.5
Molecular Biology
Conformational isomerism
Peptide sequence
Ecology, Evolution, Behavior and Systematics
Ecology
Proteins
Protein structure prediction
Peptide Fragments
Crystallography
lcsh:Biology (General)
Models, Chemical
Computational Theory and Mathematics
Biochemistry
Modeling and Simulation
Computational Biology/Sequence Motif Analysis
Alpha helix
Research Article
Details
- ISSN :
- 15537358
- Volume :
- 4
- Database :
- OpenAIRE
- Journal :
- PLoS Computational Biology
- Accession number :
- edsair.doi.dedup.....d4f8d3900beb6a479d7c8f9dfc712321