301. Estimation of Class Membership Probabilities in the Document Classification.
- Author
Carbonell, Jaime G., Siekmann, Jörg, Zhi-Hua Zhou, Hang Li, Qiang Yang, Takahashi, Kazuko, Takamura, Hiroya, and Okumura, Manabu
- Abstract
We propose a method for estimating the class membership probability of a predicted class in document classification, using classification scores not only for the predicted class but also for the other classes. Class membership probabilities are important in many applications of document classification, in which multiclass classification is often applied. In the proposed method, we first build an accuracy table by counting the number of correctly classified training samples in each range or cell of classification scores. We then apply smoothing methods, such as a moving average method with coverage, to the accuracy table. To determine the class membership probability of an unknown sample, we first calculate the classification scores of the sample, then find the range or cell that corresponds to those scores and output the value associated with that range or cell in the accuracy table. Through experiments on two different datasets with both Support Vector Machines and Naive Bayes classifiers, we empirically show that the use of multiple classification scores is effective in the estimation of class membership probabilities, and that the proposed smoothing methods for the accuracy table work quite well. We also show that the class membership probabilities estimated by the proposed method are useful in the detection of misclassified samples. [ABSTRACT FROM AUTHOR]
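The accuracy-table procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the choice of bin edges, and the use of two scores per sample are all hypothetical, and the smoothing shown is a plain pooled moving average over neighboring cells rather than the paper's "moving average with coverage" variant, whose details the abstract does not give.

```python
from bisect import bisect_right

def cell_of(scores, edges):
    """Map a tuple of classification scores to a tuple of bin indices."""
    n_bins = len(edges) - 1
    # bisect_right yields 1..n_bins for in-range scores; clamp outliers to the edge bins.
    return tuple(min(max(bisect_right(edges, s) - 1, 0), n_bins - 1) for s in scores)

def build_accuracy_table(samples, edges):
    """Count correct and total training samples per cell.

    samples: iterable of (scores, was_correct) pairs, where scores holds one
    classification score per dimension (assumed here: predicted class and runner-up).
    Returns a dict mapping cell -> [hits, total].
    """
    table = {}
    for scores, was_correct in samples:
        counts = table.setdefault(cell_of(scores, edges), [0, 0])
        counts[0] += int(was_correct)
        counts[1] += 1
    return table

def smoothed_accuracy(table, cell, radius=1):
    """Pooled moving average: sum hits and totals over cells within `radius`."""
    hits = total = 0
    for other, (h, t) in table.items():
        if all(abs(a - b) <= radius for a, b in zip(cell, other)):
            hits += h
            total += t
    # None signals that no training sample fell in the window.
    return hits / total if total else None

def estimate_probability(table, scores, edges, radius=1):
    """Look up the smoothed accuracy of the cell an unseen sample falls into."""
    return smoothed_accuracy(table, cell_of(scores, edges), radius)
```

With `radius=0` the lookup returns the raw per-cell accuracy; a larger radius pools sparse cells with their neighbors, which is the role smoothing plays in the abstract's description.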
- Published
- 2007