1. Use of a structural alphabet for analysis of short loops connecting repetitive structures
- Author
Fourrier, Laurent; Benros, Cristina; de Brevern, Alexandre G. Bioinformatique génomique et moléculaire, Institut National de la Santé et de la Recherche Médicale (INSERM). This work was supported by grants from the Ministère de la Recherche and from 'Action Bioinformatique inter EPST' 2001–2002 number 4B005F and 2003–2004. AdB was supported by a grant from the Fondation de la Recherche Médicale and is a full-time researcher at the French Institute for Health and Medical Care (INSERM). CB has a grant from the Ministère de la Recherche.
- Subjects
Repetitive Sequences, Amino Acid; Protein Conformation; Molecular Sequence Data; MESH: Protein Structure, Secondary; lcsh:Computer applications to medicine. Medical informatics; MESH: Research Support, Non-U.S. Gov't; Protein Structure, Secondary; MESH: Software; MESH: Protein Structure, Tertiary; MESH: Protein Conformation; Predictive Value of Tests; MESH: Proteins; lcsh:QH301-705.5; [SDV.BIBS] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; MESH: Molecular Sequence Data; MESH: Repetitive Sequences, Amino Acid; MESH: Peptides; Computational Biology; Proteins; MESH: Predictive Value of Tests; Protein Structure, Tertiary; lcsh:Biology (General); lcsh:R858-859.7; Peptides; Software; Research Article; MESH: Computational Biology
- Abstract
Background: Because loops connect regular secondary structures, analysis of the former depends directly on the definition of the latter. The numerous assignment methods, however, can offer different definitions. In a previous study, we defined a structural alphabet composed of 16 average protein fragments, which we called Protein Blocks (PBs). They allow an accurate description of every region of 3D protein backbones and have been used in local structure prediction. In the present study, we use this structural alphabet to analyze and predict the loops connecting two repetitive structures.
Results: We first analyzed the secondary structure assignments. Use of five different assignment methods (DSSP, DEFINE, PCURVE, STRIDE and PSEA) showed the absence of consensus: 20% of the residues were assigned to different states. The discrepancies were particularly important at the extremities of the repetitive structures. We used PBs to describe and predict the short loops because they can help analyze and in part explain these discrepancies. An analysis of the PB distribution in these regions showed some specificities in the sequence-structure relationship. Of the amino acid over- or under-representations observed in the short loop databank, 20% did not appear in the entire databank. Finally, predicting 3D structure in terms of PBs with a Bayesian approach yielded an accuracy rate of 36.0% for all loops and 41.2% for the short loops. Specific learning in the short loops increased the latter by 1%.
Conclusion: This work highlights the difficulties of assigning repetitive structures and the advantages of using more precise descriptions, that is, PBs. We observed some new amino acid distributions in the short loops and used this information to enhance local prediction. Instead of describing entire loops, our approach predicts each position in the loops locally. It can thus be used to propose many different structures for the loops and to probe and sample their flexibility. It can be a useful tool in ab initio loop prediction.
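The lack of consensus reported in the Results (residues assigned to different states by different methods) can be illustrated with a minimal sketch. It assumes each method's output has already been reduced to a per-residue three-state string of equal length; the function name `disagreement_rate`, the method labels, and the toy strings are hypothetical and invented for illustration. They are not the authors' code or data.

```python
from typing import Sequence


def disagreement_rate(assignments: Sequence[str]) -> float:
    """Fraction of residue positions where the methods do not all agree."""
    if not assignments:
        raise ValueError("need at least one assignment string")
    length = len(assignments[0])
    if any(len(a) != length for a in assignments):
        raise ValueError("all assignment strings must have the same length")
    if length == 0:
        return 0.0
    # zip(*assignments) yields one tuple of states per residue position;
    # a position is a disagreement if it contains more than one distinct state.
    differing = sum(1 for column in zip(*assignments) if len(set(column)) > 1)
    return differing / length


# Invented toy assignments (H = helix, E = strand, C = coil), not real output
# of DSSP, STRIDE, or any other program named in the abstract.
toy = {
    "method_a": "CCHHHHHHCCEEEEC",
    "method_b": "CCHHHHHCCCEEEEC",
    "method_c": "CHHHHHHHCCCEEEC",
}
rate = disagreement_rate(list(toy.values()))
print(f"disagreement rate: {rate:.2f}")
```

In this toy input the mismatches fall at the ends of the helix and strand, which mirrors the abstract's observation that discrepancies concentrate at the extremities of repetitive structures.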
- Published
2004