Confidence Interval for F1 Measure of Algorithm Performance Based on Blocked 3×2 Cross-Validation.

Authors :
Wang, Yu
Li, Jihong
Li, Yanfang
Wang, Ruibo
Yang, Xingli
Source :
IEEE Transactions on Knowledge & Data Engineering. Mar2015, Vol. 27 Issue 3, p651-659. 9p.
Publication Year :
2015

Abstract

In studies on the application of machine learning such as Information Retrieval (IR), the focus is typically on the estimation of the $F_1$ measure of algorithm performance. Approximate symmetrical confidence intervals constructed by the $F_1$ value based on cross-validated $t$ distribution are commonly used in the literature. However, theoretical analysis on the distribution of $F_1$ values shows that such distribution is actually non-symmetrical. Thus, simply using symmetrical distribution to approximate non-symmetrical distribution may be inappropriate and may result in a low degree of confidence and long interval length for the confidence interval. In the present study, a non-symmetrical confidence interval of the $F_1$ measure based on Beta prime distribution is constructed by using the $F_1$ value computed based on the average confusion matrix of a blocked $3\times 2$ cross-validation. Experimental results show that in most cases, our method has high degrees of confidence.
With an acceptable degree of confidence, our method has a shorter interval length than the approximate symmetrical confidence intervals based on the blocked $3\times 2$ and $5\times 2$ cross-validated $t$ distributions. The approximate symmetrical confidence interval based on the $10$-fold cross-validated $t$ distribution has the shortest interval length of the four confidence intervals but with low degrees of confidence in all cases. Taking these two factors into consideration, our method is recommended. [ABSTRACT FROM PUBLISHER]
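
Illustrative sketch (not part of the publisher's abstract): the abstract does not give the construction itself, so the following is only one plausible reading of how a non-symmetrical interval for $F_1$ could be obtained from an averaged confusion matrix. It assumes $F_1 = 2TP/(2TP+FP+FN)$ and a Gamma-ratio model in which $W = X/V$, with $X \sim \mathrm{Gamma}(TP+\lambda)$ and $V \sim \mathrm{Gamma}(FP+FN+2\lambda)$, follows a Beta prime distribution and $F_1 = 2W/(2W+1)$. The function names, the prior weight `lam`, and the Monte Carlo quantile step are all assumptions made for illustration; the paper may differ in every one of these details.

```python
import random


def f1_from_counts(tp, fp, fn):
    """Point value of F1 from (possibly averaged, non-integer) counts."""
    return 2.0 * tp / (2.0 * tp + fp + fn)


def f1_interval(tp, fp, fn, level=0.95, lam=1.0, draws=20_000, seed=0):
    """Monte Carlo equal-tailed interval for F1 (hypothetical helper).

    Assumption: X ~ Gamma(tp + lam), V ~ Gamma(fp + fn + 2*lam), so that
    W = X / V is Beta prime distributed and F1 = 2W / (2W + 1).
    """
    rng = random.Random(seed)
    a = tp + lam                 # shape attached to true positives
    b = fp + fn + 2.0 * lam      # shape attached to the two error counts
    samples = []
    for _ in range(draws):
        x = rng.gammavariate(a, 1.0)
        v = rng.gammavariate(b, 1.0)
        samples.append(2.0 * x / (2.0 * x + v))  # same as 2W / (2W + 1)
    samples.sort()
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * (draws - 1))]
    upper = samples[int((1.0 - tail) * (draws - 1))]
    return lower, upper


# Made-up averaged counts, standing in for the mean of the six confusion
# matrices produced by a blocked 3x2 cross-validation.
tp, fp, fn = 40.5, 9.5, 12.0
lower, upper = f1_interval(tp, fp, fn)
print(f1_from_counts(tp, fp, fn), lower, upper)
```

Because the map from $W$ to $F_1$ is monotone but nonlinear, the resulting interval is generally not centered on the point value, which is the kind of asymmetry the abstract contrasts with $t$-based intervals.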

Details

Language :
English
ISSN :
10414347
Volume :
27
Issue :
3
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
100871769
Full Text :
https://doi.org/10.1109/TKDE.2014.2359667