On Estimating Model in Feature Selection With Cross-Validation

Authors :
Chunxia Qi
Jiandong Diao
Like Qiu
Source :
IEEE Access, Vol 7, Pp 33454-33463 (2019)
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Both wrapper and hybrid feature selection methods rely on a learning algorithm to train parameters. The preset parameters and the dataset are used to construct several sub-optimal models, from which the final model is selected. This raises two questions: how should the performance of these sub-optimal models be evaluated, and how does the choice of evaluation method affect the result of feature selection? To address this evaluation problem for predictive models in feature selection, we chose a hybrid feature selection algorithm, FDHSFFS, and conducted comparative experiments with five different cross-validation (CV) methods on four UCI datasets that differ widely in feature dimension and sample size. The experimental results show that, during feature selection, twofold CV and leave-one-out CV are more suitable for model evaluation on low-dimensional, small-sample datasets, whereas tenfold nested CV and tenfold CV are more suitable for high-dimensional datasets. Tenfold nested CV comes close to an unbiased estimate, and different optimal models may select the same approximately optimal feature subset.
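
The CV schemes named in the abstract differ only in how the samples are partitioned into training and test sets. The following is a minimal sketch of that partitioning in plain Python, not the authors' implementation: the function names are hypothetical, the splits are unshuffled for simplicity, and neither FDHSFFS nor any model fitting is included. Twofold and tenfold CV correspond to k = 2 and k = 10, leave-one-out CV to k equal to the number of samples, and nested CV to a second k-fold split applied inside each outer training set.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def k_fold_indices(n_samples: int, k: int) -> Iterator[Split]:
    """Yield (train, test) index lists for unshuffled k-fold CV."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def leave_one_out_indices(n_samples: int) -> Iterator[Split]:
    """Leave-one-out CV is k-fold CV with one sample per fold."""
    return k_fold_indices(n_samples, n_samples)


def nested_k_fold_indices(
    n_samples: int, outer_k: int, inner_k: int
) -> Iterator[Tuple[List[int], List[int], List[Split]]]:
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    Inner splits are drawn only from the outer training set, so the outer
    test fold is never seen while models are being compared.
    """
    for outer_train, outer_test in k_fold_indices(n_samples, outer_k):
        inner_splits = [
            ([outer_train[i] for i in tr], [outer_train[i] for i in te])
            for tr, te in k_fold_indices(len(outer_train), inner_k)
        ]
        yield outer_train, outer_test, inner_splits
```

In this sketch the inner splits would be used to choose among candidate models or feature subsets, and the outer test fold only to score the chosen one, which is the separation that motivates nested CV as a less biased estimate.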

Details

Language :
English
ISSN :
2169-3536
Volume :
7
Database :
OpenAIRE
Journal :
IEEE Access
Accession number :
edsair.doi.dedup.....6613f8a0f6cc903e98e07604aa65f98a