Predicting disease-associated substitution of a single amino acid by analyzing residue interactions
- Source :
- BMC Bioinformatics, Vol 12, Iss 1, p 14 (2011)
- Publication Year :
- 2011
- Publisher :
- BMC, 2011.
Abstract
- Background: The rapid accumulation of data on non-synonymous single nucleotide polymorphisms (nsSNPs, also called SAPs) should allow us to further our understanding of the underlying disease-associated mechanisms. Here, we use complex networks to study the role of an amino acid in both local and global structures and determine the extent to which disease-associated and polymorphic SAPs differ in terms of their interactions with other residues. Results: We found that SAPs can be well characterized by network topological features. Mutations are probably disease-associated when they occur at a site with a high centrality value and/or high degree value in a protein structure network. We also discovered that studying the neighboring residues around a mutation site can help to determine whether the mutation is disease-related. We compiled a dataset from the Swiss-Prot variant pages and constructed a model based on the random forest algorithm to predict disease-associated SAPs. The total accuracy and Matthews correlation coefficient (MCC) were 83.0% and 0.64, respectively, as determined by 5-fold cross-validation. With an independent dataset, our model achieved a total accuracy of 80.8% and an MCC of 0.59. Conclusions: The satisfactory performance suggests that network topological features can be used as quantitative measures of the importance of a site on a protein, and this approach can complement existing methods for prediction of disease-associated SAPs. Moreover, the use of this method in SAP studies would help to determine the underlying linkage between SAPs and diseases through extensive investigation of mutual interactions between residues.
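The network features named in the abstract (degree and centrality of a residue in a protein structure network) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how contacts are defined or which centrality measure is used, so the distance cutoff of 7.0 angstroms, the use of one representative atom per residue, the choice of closeness centrality, and all function names here are assumptions made for illustration only.

```python
from collections import deque
from math import dist


def build_contact_network(coords, cutoff=7.0):
    """Link residues i and j when their representative atoms lie within
    `cutoff` angstroms. The cutoff value is an assumption, not from the source."""
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree(adj, node):
    """Number of residues in direct contact with `node`."""
    return len(adj[node])


def closeness(adj, node):
    """Closeness centrality: reachable nodes divided by the sum of their
    shortest-path lengths from `node`, found by breadth-first search."""
    seen = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen[v] = seen[u] + 1
                queue.append(v)
    total = sum(seen.values())
    return (len(seen) - 1) / total if total else 0.0


# Toy coordinates (hypothetical): five residues on a line, 5 angstroms apart,
# so only sequence neighbours are in contact at a 7 angstrom cutoff.
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0),
              (15.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
network = build_contact_network(toy_coords, cutoff=7.0)
features = {i: (degree(network, i), closeness(network, i)) for i in network}
```

In this toy chain the middle residue has the highest closeness, which matches the intuition in the abstract that substitutions at central, highly connected sites are more likely to be disease-associated. Per-residue values like these could serve as input features to a classifier such as a random forest.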
- Subjects :
- DNA Mutational Analysis
Single-nucleotide polymorphism
Biology
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
Polymorphism, Single Nucleotide
Structural Biology
Sequence Analysis, Protein
Humans
Cluster coefficient
Single amino acid
Databases, Protein
Molecular Biology
lcsh:QH301-705.5
Genetic Association Studies
Genetics
chemistry.chemical_classification
Residue (complex analysis)
Models, Statistical
Applied Mathematics
Computational Biology
Proteins
Amino acid substitution
Amino acid
Computer Science Applications
chemistry
Amino Acid Substitution
lcsh:Biology (General)
Mutation
lcsh:R858-859.7
DNA microarray
Algorithms
Research Article
Details
- Language :
- English
- ISSN :
- 1471-2105
- Volume :
- 12
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- BMC Bioinformatics
- Accession number :
- edsair.doi.dedup.....f3291a7f406b92ba9e322504ee2b0760