Choosing Between Two Classification Learning Algorithms Based on Calibrated Balanced $$5\times 2$$ Cross-Validated F-Test
- Authors
- Wang, Yu; Li, Jihong; Li, Yanfang
- Subjects
- Machine learning; classification algorithms; calibration; F-test (mathematical statistics); degrees of freedom
- Abstract
The $$5\times 2$$ cross-validated F-test, based on five independent replications of 2-fold cross-validation, is recommended for choosing between two classification learning algorithms. However, reusing the same data within a $$5\times 2$$ cross-validation causes the real degrees of freedom (DOF) of the test to be lower than those of the F(10, 5) distribution given by Alpaydin (Neural Comput 11:1885-1892, [1]). This easily leads the test to suffer from high type I and type II errors. Moreover, the random partitions used in $$5\times 2$$ cross-validation make the DOF of the test difficult to analyze. To address the correlation structure, Wang et al. (Neural Comput 26(1):208-235, [2]) proposed a new blocked $$3 \times 2$$ cross-validation that accounts for the correlation between any two 2-fold cross-validations. Building on this, the present study puts forward a calibrated balanced $$5\times 2$$ cross-validated F-test following an F(7, 5) distribution, obtained by calibrating the DOF of the F(10, 5) distribution. Simulated and real-data studies demonstrate that the calibrated balanced $$5\times 2$$ cross-validated F-test has lower type I and type II errors than the $$5\times 2$$ cross-validated F-test following F(10, 5) in most cases.
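The F-statistic underlying both tests is the combined $$5\times 2$$ cv statistic of Alpaydin [1]; the calibration proposed here changes only the reference distribution (F(7, 5) instead of F(10, 5)), not the statistic. A minimal sketch, using synthetic error-rate differences (the function name and example values are illustrative, not from the paper):

```python
# Sketch of the combined 5x2 cross-validated F-statistic (Alpaydin, 1999).
# p[i][j] is the difference in error rates between the two algorithms
# on fold j of replication i; the values below are synthetic.

def cv_5x2_f_statistic(p):
    """F = (sum_i sum_j p_ij^2) / (2 * sum_i s_i^2), where s_i^2 is the
    variance of the two fold differences in replication i."""
    numerator = sum(pij ** 2 for rep in p for pij in rep)
    denominator = 0.0
    for p1, p2 in p:
        mean = (p1 + p2) / 2.0
        denominator += (p1 - mean) ** 2 + (p2 - mean) ** 2  # s_i^2
    return numerator / (2.0 * denominator)

# Synthetic error-rate differences: 5 replications x 2 folds.
p = [(0.02, 0.04), (0.01, 0.03), (0.05, 0.01), (0.02, 0.02), (0.03, 0.05)]
f_stat = cv_5x2_f_statistic(p)
print(f_stat)  # -> 3.5 for these synthetic values
```

Under the null hypothesis, the original test compares this statistic with an F(10, 5) critical value, while the calibrated test uses F(7, 5); the smaller numerator DOF yields a larger critical value at the same significance level, which is consistent with the reduced type I error reported in the abstract.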
- Published
- 2017