Prediction of secondary protein structure with binary coding patterns of amino acid and nucleotide physicochemical properties
- Source :
- International Journal of Quantum Chemistry. 92:123-134
- Publication Year :
- 2003
- Publisher :
- Wiley, 2003.
Abstract
- We present a binary coding algorithm for α- and β-protein fold prediction. The method links amino acid molecular polarity patterns and the physicochemical properties of nucleotide bases, coded by means of binary addresses. Primary sequences that define secondary protein structure were analyzed with respect to the symbolic oligopeptides (SO) obtained by reducing the 20-letter amino acid alphabet to a binary alphabet of nonpolar group 0 (W, C, I, F, M, V, L, Y) and polar group 1 (Q, R, H, K, N, E, D, S, G, T, A, P). The groups were extracted from the Grantham polarity scale with the clustering around medoids procedure. Transforming protein strings into binary coding patterns of the polar and nonpolar amino acid groups reduced the number of analyzed elements within a protein motif of length n by a factor of 10^n. The SMO learning algorithm for support vector machines was applied to classify α-helices and β-strands. It was shown that the relative frequencies of binary hexapeptides classify all 174 nonhomologous α- and β-protein folds from the Jpred database with 100% accuracy. The 10-fold cross-validation and leave-one-out tests both gave 86.78% accuracy. A classification tree confirmed the results of the SMO analysis and correctly classified 100% of the folds by means of 9 binary hexapeptides. A linear block triple-check code was proposed for the description of hexapeptide patterns. The presented method enables simple, quick, and accurate prediction of α- and β-protein folding types from primary amino acid and nucleotide sequences on a personal computer. Our results imply that a few amino acid polarity patterns specified by the nucleotide physicochemical properties describe basic protein folding types with >90% accuracy. © 2003 Wiley Periodicals, Inc. Int J Quantum Chem, 2003
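The encoding step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two residue groups are taken from the abstract; the function names (`to_binary`, `hexapeptide_frequencies`) and the use of overlapping sliding windows are assumptions made here for illustration, since the record does not say how hexapeptides were counted. The sketch maps a sequence to the binary alphabet and computes the relative frequency of each of the 2^6 = 64 possible binary hexapeptides, which is the kind of feature vector a classifier could then take as input. It also shows where the reduction factor comes from: 20^6 / 2^6 = 10^6 for motifs of length 6.

```python
from collections import Counter
from itertools import product

# Residue groups as listed in the abstract.
NONPOLAR = set("WCIFMVLY")      # coded as 0
POLAR = set("QRHKNEDSGTAP")     # coded as 1


def to_binary(sequence):
    """Map a one-letter amino acid sequence to a string of 0s and 1s."""
    bits = []
    for residue in sequence.upper():
        if residue in NONPOLAR:
            bits.append("0")
        elif residue in POLAR:
            bits.append("1")
        else:
            raise ValueError(f"Unsupported residue: {residue!r}")
    return "".join(bits)


def hexapeptide_frequencies(binary, k=6):
    """Relative frequency of every binary k-mer in the string.

    Assumption: k-mers are counted with overlapping sliding windows.
    """
    windows = [binary[i:i + k] for i in range(len(binary) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    patterns = ["".join(p) for p in product("01", repeat=k)]
    return {p: (counts[p] / total if total else 0.0) for p in patterns}


if __name__ == "__main__":
    encoded = to_binary("MKVLAAG")   # hypothetical example sequence
    freqs = hexapeptide_frequencies(encoded)
    print(encoded)
    print({p: f for p, f in freqs.items() if f > 0})
    # Search-space reduction for motifs of length 6.
    print(20 ** 6 // 2 ** 6)
```

A sequence shorter than six residues yields an all-zero frequency vector in this sketch; a real pipeline would need to decide how to handle such cases.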
- Subjects :
- chemistry.chemical_classification
Protein structure prediction
Condensed Matter Physics
Genetic code
Atomic and Molecular Physics, and Optics
Amino acid
Crystallography
chemistry
Personal computer
Protein folding
Binary code
Physical and Theoretical Chemistry
Structural motif
protein fold
secondary structure
prediction
error-correcting code
genetic code
nucleotides
amino acids
Protein secondary structure
Details
- ISSN :
- 1097-461X and 0020-7608
- Volume :
- 92
- Database :
- OpenAIRE
- Journal :
- International Journal of Quantum Chemistry
- Accession number :
- edsair.doi.dedup.....e5833ec381e7f4bbf46c4a92f92a7cc9
- Full Text :
- https://doi.org/10.1002/qua.10499