
A Novel Method for Prediction of Protein Domain Using Distance-Based Maximal Entropy

Authors :
Yan Wang
Chunguang Zhou
Yanxin Huang
Shuxue Zou
Source :
Journal of Bionic Engineering. 5:215-223
Publication Year :
2008
Publisher :
Springer Science and Business Media LLC, 2008.

Abstract

Detecting the boundaries of protein domains is an important and challenging task in both experimental and computational structural biology. This paper presents a promising method for detecting the domain structure of a protein from sequence information alone. The method analyzes multiple sequence alignments derived from a database search. Several measures are defined to quantify the domain information content of each position along the sequence, and these are combined into a single predictor using a support vector machine (SVM). More importantly, domain detection is first cast as an imbalanced data learning problem, and a novel undersampling method based on distance-based maximal entropy in the SVM feature space is proposed. The overall precision is about 80%. Simulation results demonstrate that the method can help not only in predicting the complete 3D structure of a protein but also in machine learning systems for general imbalanced datasets.
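
The abstract does not spell out the undersampling procedure, so the following is only an illustrative sketch of the general idea of thinning a majority class using distances measured in a kernel-induced feature space. It is not the authors' distance-based maximal entropy criterion: a greedy farthest-point selection is swapped in as a simple stand-in that also favors a spread-out subset. The RBF kernel, the `gamma` value, and all function names are assumptions made for illustration. The only fact relied on is that, for a kernel K, the squared feature-space distance is K(x,x) + K(y,y) - 2K(x,y), which reduces to 2 - 2K(x,y) for an RBF kernel.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel K(x, y) = exp(-gamma * ||x - y||^2). gamma is an assumed value."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def feature_space_distance(x, y, gamma=0.5):
    """Distance between phi(x) and phi(y) in the RBF feature space.

    ||phi(x) - phi(y)||^2 = K(x,x) + K(y,y) - 2 K(x,y) = 2 - 2 K(x,y).
    """
    return math.sqrt(max(0.0, 2.0 - 2.0 * rbf_kernel(x, y, gamma)))


def undersample_majority(majority, n_keep, gamma=0.5):
    """Greedy farthest-point undersampling (a stand-in, not the paper's method).

    Starts from the first sample and repeatedly keeps the unselected sample
    whose feature-space distance to its nearest kept sample is largest.
    """
    if n_keep <= 0:
        return []
    if n_keep >= len(majority):
        return list(majority)

    selected = [0]
    chosen = {0}
    # min_dist[i]: distance from sample i to its nearest selected sample
    min_dist = [feature_space_distance(majority[0], p, gamma) for p in majority]

    while len(selected) < n_keep:
        candidates = [i for i in range(len(majority)) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        for i, p in enumerate(majority):
            d = feature_space_distance(majority[nxt], p, gamma)
            if d < min_dist[i]:
                min_dist[i] = d

    return [majority[i] for i in selected]


# Hypothetical toy data: three loose clusters of majority-class feature vectors.
majority_samples = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0),
    (-5.0, 5.0),
]
kept = undersample_majority(majority_samples, n_keep=3)
print(kept)
```

In a setup like the one the abstract describes, the reduced majority set would then be combined with the untouched minority class before training the SVM; how the paper itself balances the two classes is not stated in this record.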

Details

ISSN :
2543-2141 and 1672-6529
Volume :
5
Database :
OpenAIRE
Journal :
Journal of Bionic Engineering
Accession number :
edsair.doi...........778efa3a79e28cdd0735e90100c6b972
Full Text :
https://doi.org/10.1016/s1672-6529(08)60027-x