K-Means Clustering with Infinite Feature Selection for Classification Tasks in Gene Expression Data

Authors :: Ghazali Sulong
Muhammad Akmal Remli
Mohd Saberi Mohamad
Shahreen Kasim
Sigeru Omatu
Hui Wen Nies
Safaai Deris
Kauthar Mohd Daud
Source :: 11th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB), ISBN: 9783319608150
Publication Year :: 2017
Publisher :: Springer International Publishing, 2017.
Abstract: In bioinformatics and clinical research, microarray technology has been widely used to distinguish normal from tumour samples in cancer datasets. However, the high dimensionality of gene expression data degrades classification accuracy. Feature selection is therefore needed to select informative genes and remove non-informative ones. Some feature selection methods, however, ignore the interaction between genes. To account for this, similar genes are clustered together and dissimilar genes are placed in other clusters. To achieve higher classification accuracy, this research proposes k-means clustering combined with infinite feature selection for identifying informative genes in the selected subset. The approach was applied to colorectal cancer and small round blue cell tumour datasets, and it obtained higher classification accuracy than previous work.
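
As a rough illustration of the pipeline the abstract describes, the sketch below clusters genes with k-means and then ranks genes with a simplified form of infinite feature selection. It is not the authors' implementation. The way the two steps are combined (ranking inside each cluster and keeping the top few genes), the function names, and the parameters `alpha`, `per_cluster`, and the 0.9 scaling factor are all assumptions made for illustration.

```python
import numpy as np

def kmeans_genes(X, k, n_iter=50, seed=0):
    """Cluster genes (columns of X) with plain Lloyd's k-means."""
    rng = np.random.default_rng(seed)
    G = X.T.astype(float)  # one row per gene, one column per sample
    centers = G[rng.choice(len(G), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every gene to every centre.
        d = ((G[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = G[labels == j].mean(axis=0)
    return labels

def inf_fs_scores(X, alpha=0.5):
    """Score features by summing path weights of all lengths in a feature graph.

    Assumed edge weight: alpha * max(std_i, std_j) + (1 - alpha) * (1 - |rank corr|).
    """
    std = X.std(axis=0)
    span = std.max() - std.min()
    std = (std - std.min()) / span if span > 0 else np.zeros_like(std)
    sigma = np.maximum.outer(std, std)
    # Spearman-style correlation via ranks (ties are not averaged here).
    ranks = X.argsort(axis=0).argsort(axis=0).astype(float)
    corr = np.nan_to_num(np.corrcoef(ranks, rowvar=False))
    A = alpha * sigma + (1 - alpha) * (1 - np.abs(corr))
    rho = np.max(np.abs(np.linalg.eigvalsh(A)))  # A is symmetric
    if rho == 0:
        return np.zeros(A.shape[0])
    r = 0.9 / rho  # keeps the geometric series of (r * A) convergent
    n = A.shape[0]
    S = np.linalg.inv(np.eye(n) - r * A) - np.eye(n)
    return S.sum(axis=1)

def select_genes(X, k=3, per_cluster=2, seed=0):
    """Assumed combination: keep the top-scoring genes from each cluster."""
    labels = kmeans_genes(X, k, seed=seed)
    selected = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue
        scores = inf_fs_scores(X[:, idx])
        top = idx[np.argsort(scores)[::-1][:per_cluster]]
        selected.extend(top.tolist())
    return sorted(selected)

if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix: 20 samples x 30 genes.
    X_demo = np.random.default_rng(1).normal(size=(20, 30))
    print(select_genes(X_demo, k=3, per_cluster=2))
```

The returned column indices could then be used to subset the expression matrix before training a classifier. The classifier and evaluation protocol are not specified in the abstract and are left out here.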

Subjects :: Cancer classification
Computer science
Gene expression
Gene chip analysis
k-means clustering
Feature selection
Pattern recognition
Artificial intelligence
High dimensionality
Cluster analysis
Gene

ISBN :: 978-3-319-60815-0
Database :: OpenAIRE
Accession number :: edsair.doi...........6758407ec09cd5dc1f5456d854dc9bb5
Full Text :: https://doi.org/10.1007/978-3-319-60816-7_7