Enhancing Cancerous Gene Selection and Classification for High-Dimensional Microarray Data Using a Novel Hybrid Filter and Differential Evolutionary Feature Selection.
- Source :
- Cancers. Dec 2024, Vol. 16 Issue 23, p3913. 37p.
- Publication Year :
- 2024
Abstract
- Simple Summary: To improve cancer classification performance on high-dimensional microarray datasets, this work proposes combining filter-based and differential evolution (DE) feature selection techniques. Genes (features) of the high-dimensional microarray datasets are first scored with common filter methods; only the highest-ranked features are kept, and superfluous and irrelevant ones are eliminated to reduce the dimensionality of the datasets. The remaining features are then optimized further by DE, producing noticeably better classification results. This could lead to a substantial improvement in cancer classification while using fewer features of the microarray datasets. Background: In recent years, microarray datasets have been used to store information about human genes and their expression in order to diagnose cancer in its early stages. However, most microarray datasets contain thousands of redundant, irrelevant, and noisy genes, which makes it very challenging to apply machine learning algorithms effectively to these high-dimensional datasets. Methods: To address this challenge, this paper introduces a hybrid filter and differential evolution-based feature selection method that chooses only the most influential genes (features) of high-dimensional microarray datasets in order to improve cancer diagnosis and classification. The proposed approach is a two-phase hybrid feature selection model: it first selects the top-ranked features using popular filter feature selection methods and then identifies the best subset of those features through differential evolution (DE) optimization. Popular machine learning algorithms are then trained on the final training microarray datasets, which contain only the best features, in order to produce strong cancer classification results.
Four high-dimensional cancer microarray datasets were used in this study to evaluate the proposed method: Breast, Lung, Central Nervous System (CNS), and Brain. Results: The experimental results demonstrate that the classification accuracy of the proposed hybrid filter-DE approach improved on that of the filter methods alone, reaching 100%, 100%, 93%, and 98% on the Brain, CNS, Breast, and Lung datasets, respectively. Furthermore, applying the suggested DE-based feature selection removed around 50% of the features selected by the filter methods for these four microarray datasets. The average accuracy improvements achieved by the proposed methods were up to 42.47%, 57.45%, 16.28%, and 43.57% on the Brain, CNS, Lung, and Breast datasets, respectively, compared with 41.43%, 53.66%, 17.53%, and 61.70% reported in previous works. Conclusions: Compared to the previous works, the proposed methods achieved higher improvement percentages on the Brain and CNS datasets, a comparable improvement percentage on the Lung dataset, and a lower improvement percentage on the Breast dataset. [ABSTRACT FROM AUTHOR]
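The two-phase idea described in the abstract (rank features with a filter, keep the top-ranked ones, then let DE search for a smaller subset) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Fisher-like filter score for two classes labelled 0 and 1, the nearest-centroid classifier used as the fitness evaluator, the DE/rand/1/bin variant with a 0.5 threshold to turn real-valued vectors into feature masks, the small subset-size penalty, and all function names and parameter values.

```python
import numpy as np

def filter_scores(X, y):
    """Fisher-like score per feature; assumes two classes labelled 0 and 1."""
    a, b = X[y == 0], X[y == 1]
    return (a.mean(0) - b.mean(0)) ** 2 / (a.var(0) + b.var(0) + 1e-12)

def centroid_accuracy(X_tr, y_tr, X_te, y_te):
    """Accuracy of a nearest-centroid classifier (illustrative stand-in)."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(0) for c in classes])
    dist = ((X_te[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return float((classes[dist.argmin(1)] == y_te).mean())

def fitness(vec, X_tr, y_tr, X_va, y_va):
    """Validation accuracy of the masked features, minus a small size penalty."""
    mask = vec > 0.5                      # real-valued vector -> feature mask
    if not mask.any():
        return -1.0                       # an empty subset is never preferred
    acc = centroid_accuracy(X_tr[:, mask], y_tr, X_va[:, mask], y_va)
    return acc - 0.01 * mask.mean()

def de_select(X_tr, y_tr, X_va, y_va, pop_size=20, gens=30, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over vectors in [0, 1]^d; returns a boolean feature mask."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]
    pop = rng.random((pop_size, d))
    fit = np.array([fitness(p, X_tr, y_tr, X_va, y_va) for p in pop])
    for _ in range(gens):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.choice(others, 3, replace=False)
            mutant = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), 0.0, 1.0)
            cross = rng.random(d) < CR
            cross[rng.integers(d)] = True  # take at least one gene from the mutant
            trial = np.where(cross, mutant, pop[i])
            f = fitness(trial, X_tr, y_tr, X_va, y_va)
            if f >= fit[i]:                # greedy one-to-one replacement
                pop[i], fit[i] = trial, f
    return pop[fit.argmax()] > 0.5

def hybrid_select(X_tr, y_tr, X_va, y_va, top_k=20, **de_kwargs):
    """Phase 1: keep the top_k filter-ranked features. Phase 2: refine with DE."""
    top = np.argsort(filter_scores(X_tr, y_tr))[::-1][:top_k]
    mask = de_select(X_tr[:, top], y_tr, X_va[:, top], y_va, **de_kwargs)
    return top[mask]                       # indices into the original feature space

if __name__ == "__main__":
    # Synthetic stand-in for a microarray: 60 samples, 200 "genes", 5 informative.
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 30)
    X = rng.normal(size=(60, 200))
    X[y == 1, :5] += 3.0
    idx = rng.permutation(60)
    tr, va = idx[:40], idx[40:]
    selected = hybrid_select(X[tr], y[tr], X[va], y[va], top_k=20)
    print(len(selected), "features kept out of", X.shape[1])
```

In this sketch the DE fitness is evaluated on a single held-out validation split for brevity; with sample sizes as small as those typical of microarray data, a cross-validated fitness would likely be a more robust choice.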
Details
- Language :
- English
- ISSN :
- 20726694
- Volume :
- 16
- Issue :
- 23
- Database :
- Academic Search Index
- Journal :
- Cancers
- Publication Type :
- Academic Journal
- Accession number :
- 181660912
- Full Text :
- https://doi.org/10.3390/cancers16233913