1. A Fuzzy Twin Support Vector Machine Based on Dissimilarity Measure and Its Biomedical Applications
- Author
Qiu, Jianxiang, Xie, Jialiang, Zhang, Dongxiao, Zhang, Ruping, and Lin, Mingwei
- Abstract
Biomedical data exhibit high-dimensional complexity in their internal structure and are susceptible to noise interference, which makes classification tasks on biomedical data highly challenging. Twin support vector machine (TSVM) is a machine learning algorithm that can effectively solve pattern recognition problems. To mitigate the negative impact of noise, researchers have combined fuzzy set theory with TSVM and used fuzzy membership to describe the influence of different samples on constructing the optimal hyperplane, thereby extending TSVM to the fuzzy twin support vector machine (FTSVM). In this paper, a dissimilarity measure based on data distribution is introduced into the fuzzy membership assignment process, and a novel fuzzy membership assignment strategy is designed to effectively reduce the negative impact of noise in biomedical data. Rather than relying on geometric distance, this strategy takes data distribution as the primary factor in measuring dissimilarity between samples and then constructs a heuristic function to assign fuzzy membership to different samples. Combining this strategy with TSVM, this paper proposes a fuzzy twin support vector machine based on dissimilarity measure (DFTSVM), which can effectively solve classification problems with noise and shows excellent generalization performance on biomedical data. Moreover, DFTSVM employs a coordinate descent strategy with active-set shrinking to reduce computational complexity, which significantly improves the training speed of the model. Experiments are conducted on 14 biomedical datasets to compare the performance of DFTSVM with 10 heterogeneous machine learning classification algorithms and four homologous algorithms. The results demonstrate that DFTSVM outperforms the other algorithms in terms of classification performance on biomedical data. DFTSVM also exhibits excellent generalization performance in noisy environments, and its advantages in generalization performance and noise robustness become more prominent as the noise rate increases.
- Published
- 2024