1. Gene Classification Based on Multi-Class SVMs with Systematic Sampling and Hierarchical Clustering (SSHC) Algorithm
- Author
Nwayyin Najat, Mohammed
- Subjects
Machine Learning, Support Vector Machine, Cluster Analysis, Gene Expression, Humans, Algorithms
- Abstract
Support vector machines (SVMs) are machine learning algorithms with high classification accuracy. However, SVM training is computationally expensive, so the method is not very efficient with large datasets. In this study, we use multi-class support vector machines combined with systematic sampling and hierarchical clustering (the SSHC-MCSVM algorithm) to classify gene expression data. Gene expression profiles are treated as large datasets; the two datasets used in this study come from obese and lean individuals. In the proposed SSHC-MCSVM algorithm, the gene expression data are regrouped into new sets of genes by the systematic sampling with hierarchical clustering (SSHC) algorithm. The SSHC algorithm is repeated n times, and the k-partitions whose clusters have a high adjusted Rand index (ARI) are chosen. Multi-class support vector machines are then applied to the best regrouped gene expression data to classify the significant genes. Performance is measured by accuracy, recall, and precision. The proposed SSHC-MCSVM algorithm classified the significant genes with high accuracy, recall, and precision.
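Two building blocks named in the abstract, systematic sampling and the adjusted Rand index, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`systematic_sample`, `adjusted_rand_index`, `best_partition`) are hypothetical, and the idea of scoring each candidate partition against a single reference labeling is an assumption, since the abstract does not say what the ARI is computed against. The hierarchical clustering and SVM steps are omitted.

```python
import random
from collections import Counter
from math import comb


def systematic_sample(items, step, start=None, rng=None):
    """Take every `step`-th item, beginning at an offset in [0, step).

    If `start` is not given, the offset is drawn at random.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if start is None:
        start = (rng or random).randrange(step)
    return items[start::step]


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree

    # Pairs of items placed together by both labelings, by A only, by B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # both labelings are all-in-one or all-singletons
    return (together_both - expected) / (maximum - expected)


def best_partition(candidates, reference):
    """Pick the candidate labeling with the highest ARI against `reference`.

    Assumption: candidates are scored against one reference labeling.
    """
    return max(candidates, key=lambda labels: adjusted_rand_index(labels, reference))


if __name__ == "__main__":
    genes = [f"gene_{i}" for i in range(10)]  # placeholder identifiers
    print(systematic_sample(genes, step=3, start=1))

    reference = [0, 0, 1, 1]
    candidates = [[0, 1, 0, 1], [1, 1, 0, 0]]
    print(best_partition(candidates, reference))
```

The ARI is invariant to how clusters are numbered, so a partition that groups the same items together scores 1.0 even if its labels are swapped.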
- Published
2022