
Fast hybrid dimensionality reduction method for classification based on feature selection and grouped feature extraction.

Authors :
Li, Mengmeng
Wang, Haofeng
Yang, Lifang
Liang, You
Shang, Zhigang
Wan, Hong
Source :
Expert Systems with Applications. Jul2020, Vol. 150, pN.PAG-N.PAG. 1p.
Publication Year :
2020

Abstract

• A fast hybrid dimensionality reduction method for classification is proposed.
• Multi-strategy based feature selection is used to filter out irrelevant features.
• Grouped feature extraction is used to remove redundancy among features.
• The proposed method shows excellent efficiency and competitive classification performance.

Dimensionality reduction is a basic and critical technology for data mining, especially in the current "big data" era. As two different types of methods, feature selection and feature extraction each have their pros and cons. In this paper, we combine multi-strategy feature selection and grouped feature extraction to propose a novel fast hybrid dimensionality reduction method that incorporates their respective advantages in removing irrelevant and redundant information. First, the intrinsic dimensionality of the data set is estimated by maximum likelihood estimation. Fisher Score and Information Gain based feature selection are then used as multi-strategy methods to remove irrelevant features. Using the redundancy among the selected features as the clustering criterion, the features are grouped into a certain number of clusters. Within each cluster, Principal Component Analysis (PCA) based feature extraction is carried out to remove redundant information. Four classical classifiers and representation entropy are used to evaluate the classification performance and information loss of the reduced set. The runtime results show that the proposed hybrid method is consistently much faster than the other three methods on almost all of the data sets used. Meanwhile, the proposed method shows competitive classification performance, with essentially no significant difference from the other methods. The proposed method reduces the dimensionality of the raw data quickly, with excellent efficiency and competitive classification performance relative to the compared methods. [ABSTRACT FROM AUTHOR]
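
The select, group, and extract pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and several parts are assumptions made for illustration: only the Fisher Score is used for selection (the Information Gain strategy and the maximum likelihood estimate of intrinsic dimensionality are omitted, so `n_keep` is supplied by hand), redundancy is measured by absolute Pearson correlation with a hypothetical `threshold` of 0.8, grouping is a simple greedy rule, and each group is reduced to its first principal component. The function names `fisher_score`, `group_by_correlation`, `first_principal_component`, and `hybrid_reduce` are hypothetical.

```python
import numpy as np


def fisher_score(X, y):
    """Fisher score per feature: between-class over within-class variance."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def group_by_correlation(X, threshold=0.8):
    """Greedy grouping (an assumed rule): a feature joins the first group
    whose seed feature it correlates with at or above the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    groups = []
    for j in range(X.shape[1]):
        for g in groups:
            if corr[j, g[0]] >= threshold:
                g.append(j)
                break
        else:
            groups.append([j])  # no similar group found: start a new one
    return groups


def first_principal_component(X):
    """Project centered data onto its leading principal direction via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]


def hybrid_reduce(X, y, n_keep, threshold=0.8):
    """Select the n_keep highest-scoring features, group the redundant
    ones, and replace each group by its first principal component."""
    scores = fisher_score(X, y)
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])
    Xs = X[:, keep]
    groups = group_by_correlation(Xs, threshold)
    Z = np.column_stack([first_principal_component(Xs[:, g]) for g in groups])
    return Z, keep, groups


# Synthetic demo: two informative signals with redundant noisy copies,
# followed by five irrelevant noise features.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
a = 2 * y + rng.normal(size=200)
b = 2 * y + rng.normal(size=200)
X = np.column_stack(
    [a + 0.1 * rng.normal(size=200) for _ in range(3)]
    + [b + 0.1 * rng.normal(size=200) for _ in range(2)]
    + [rng.normal(size=200) for _ in range(5)]
)

Z, keep, groups = hybrid_reduce(X, y, n_keep=5)
print(Z.shape, keep.tolist(), groups)
```

In this toy setting the selection step discards the noise columns, and the grouping step collapses each set of near-duplicate columns into a single extracted feature. Running PCA separately inside small groups, rather than once over all features, is the part of the design that the abstract credits for the method's speed.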

Details

Language :
English
ISSN :
09574174
Volume :
150
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
142794641
Full Text :
https://doi.org/10.1016/j.eswa.2020.113277