On Estimating Model in Feature Selection With Cross-Validation
- Source :
- IEEE Access, Vol 7, Pp 33454-33463 (2019)
- Publication Year :
- 2019
- Publisher :
- IEEE, 2019.
-
Abstract
- Both wrapper and hybrid methods in feature selection require the intervention of a learning algorithm to train parameters. The preset parameters and the dataset are used to construct several sub-optimal models, from which the final model is selected. The question is how to evaluate the performance of these sub-optimal models, and what effects different evaluation methods have on the result of feature selection. Aiming at the evaluation problem of predictive models in feature selection, we chose a hybrid feature selection algorithm, FDHSFFS, and conducted comparative experiments on four UCI datasets with large differences in feature dimension and sample size, using five different cross-validation (CV) methods. The experimental results show that, in the process of feature selection, twofold CV and leave-one-out CV are more suitable for model evaluation on low-dimensional, small-sample datasets, while tenfold nested CV and tenfold CV are more suitable for model evaluation on high-dimensional datasets; tenfold nested CV is close to an unbiased estimate, and different optimal models may choose the same approximately optimal feature subset.
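- The CV variants compared in the abstract differ only in how the data are partitioned into train/test folds. The following is a minimal sketch (not code from the paper, and the function names are illustrative) of k-fold index generation, which covers twofold CV (k = 2), tenfold CV (k = 10), and leave-one-out CV as the special case k = n:

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Each sample appears in exactly one test fold; the remaining
    samples form that fold's training set.
    """
    indices = list(range(n_samples))
    # Distribute any remainder across the first (n_samples % k) folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def loo_indices(n_samples):
    """Leave-one-out CV is k-fold CV with k equal to the sample count."""
    return kfold_indices(n_samples, n_samples)
```

- Nested CV, which the abstract reports as the near-unbiased estimator, adds an inner loop of the same kind: the inner folds select the feature subset or parameters, and only the outer test fold scores the resulting model.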
Details
- Language :
- English
- ISSN :
- 21693536
- Volume :
- 7
- Database :
- Directory of Open Access Journals
- Journal :
- IEEE Access
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.1ffa7be89d414c1aabac1c811d3c4590
- Document Type :
- article
- Full Text :
- https://doi.org/10.1109/ACCESS.2019.2892062