1. FTWSVM-SR: DNA-Binding Proteins Identification via Fuzzy Twin Support Vector Machines on Self-Representation
- Author
- Yijie Ding, Quan Zou, Yi Zou, and Li Peng
- Subjects
Support Vector Machine; Multiple kernel learning; Correlation coefficient; Computer science; Value (computer science); Health Informatics; Pattern recognition; Fuzzy logic; General Biochemistry, Genetics and Molecular Biology; Computer Science Applications; DNA-Binding Proteins; Machine Learning; Kernel (linear algebra); Artificial intelligence; Noise (video); Algorithms; Membership function
- Abstract
Because detecting DNA-binding proteins (DBPs) is costly, many machine learning (ML) algorithms have been used to process and detect DBPs on a large scale. Previous methods did not account for noisy samples. In this study, a fuzzy twin support vector machine (FTWSVM) is employed to detect DBPs. First, multiple types of protein sequence features are converted into kernel matrices; then, a multiple kernel learning (MKL) algorithm is used to linearly combine the kernels; next, a self-representation-based (SR) membership function is used to estimate the membership value (weight) of each training sample; finally, we feed the integrated kernel matrix and the membership values into the FTWSVM-SR model for training and testing. Compared with other predictive models, FTWSVM based on SR (FTWSVM-SR) achieves the best Matthews correlation coefficient (MCC): 0.7410 and 0.5909 on two independent test sets (PDB186 and PDB2272), respectively. The results confirm that our method can be an effective DBP detection tool. Our model can screen and analyze DBPs on a large scale before biochemical experiments are carried out.
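Two of the steps named in the abstract can be illustrated with a minimal Python sketch: the linear combination of several kernel matrices and the MCC used for evaluation. This is not the authors' implementation. The function names `combine_kernels` and `mcc` are hypothetical, the kernel weights are supplied by hand rather than learned by an MKL algorithm, and the labels are assumed to be encoded as 1 (DBP) and 0 (non-DBP).

```python
import math


def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized square kernel matrices.

    Assumption: the weights are given directly; the MKL step described in
    the abstract would learn them from the data instead.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel matrix")
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, weight in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * kernel[i][j]
    return combined


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data: two 2x2 kernel matrices combined with equal weights.
k_a = [[1.0, 0.0], [0.0, 1.0]]
k_b = [[1.0, 1.0], [1.0, 1.0]]
k_combined = combine_kernels([k_a, k_b], [0.5, 0.5])
```

The combined matrix could then be passed to any classifier that accepts a precomputed kernel. The self-representation membership function and the FTWSVM solver are not sketched here, since the abstract does not give their formulation.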
- Published
- 2021