Towards enhancing centroid classifier for text classification—A border-instance approach

Authors :
Wang, Deqing
Wu, Junjie
Zhang, Hui
Xu, Ke
Lin, Mengxiang
Source :
Neurocomputing. Feb2013, Vol. 101, p299-308. 10p.
Publication Year :
2013

Abstract

Text classification/categorization (TC) assigns new unlabeled natural language documents to predefined thematic categories. The centroid-based classifier (CC) has been widely used for TC because of its simplicity and efficiency. However, it has long been criticized for its relatively low classification accuracy compared with state-of-the-art classifiers such as support vector machines (SVMs). In this paper, we find that, for CC, using only border instances rather than all instances to construct centroid vectors can yield higher generalization accuracy. Along this line, we propose the Border-Instance-based Iteratively Adjusted Centroid Classifier (IACC_BI), which relies on border instances found by some routine, e.g. the 1-Nearest-and-1-Furthest-Neighbors strategy, to construct centroid vectors for CC. IACC_BI then iteratively adjusts the initial centroid vectors according to the misclassified training instances. Our extensive experiments on 11 real-world text corpora demonstrate that IACC_BI greatly improves the performance of centroid-based classifiers and obtains classification accuracy competitive with the well-known SVMs, at significantly lower computational cost. [ABSTRACT FROM AUTHOR]
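
The pipeline outlined in the abstract (pick border instances, build one centroid per class from them, then adjust the centroids using misclassified training instances) can be illustrated with a minimal Python sketch. The abstract does not give the selection rule or the update formula, so everything specific here is an assumption rather than the authors' method: `select_border` is one possible reading of a 1-nearest-and-1-furthest-neighbor idea, the update is a generic pull-toward-true-class, push-from-predicted-class step, similarity is cosine, and the names `fit`, `predict`, `epochs`, and `lr` are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def select_border(X, y):
    """ASSUMED rule, not taken from the paper: for each instance, mark its most
    similar instance from another class and its least similar instance from
    its own class as border instances."""
    border = set()
    for i, xi in enumerate(X):
        other = [j for j in range(len(X)) if y[j] != y[i]]
        same = [j for j in range(len(X)) if y[j] == y[i] and j != i]
        if other:
            border.add(max(other, key=lambda j: dot(xi, X[j])))
        if same:
            border.add(min(same, key=lambda j: dot(xi, X[j])))
    return border

def predict(x, centroids):
    """Return the label of the centroid with the highest cosine similarity."""
    x = normalize(x)
    return max(centroids, key=lambda c: dot(x, centroids[c]))

def fit(X, y, epochs=10, lr=0.1):
    X = [normalize(x) for x in X]
    border = select_border(X, y)
    centroids = {}
    for c in sorted(set(y)):
        members = [i for i in range(len(X)) if y[i] == c]
        # Fall back to all class members if none was marked as border.
        chosen = [i for i in members if i in border] or members
        total = [sum(col) for col in zip(*(X[i] for i in chosen))]
        centroids[c] = normalize(total)
    for _ in range(epochs):
        scored = [(x, t, predict(x, centroids)) for x, t in zip(X, y)]
        wrong = [(x, t, p) for x, t, p in scored if p != t]
        if not wrong:
            break
        # ASSUMED update: pull the true centroid toward each misclassified
        # instance and push the wrongly chosen centroid away from it.
        for x, t, p in wrong:
            centroids[t] = [a + lr * b for a, b in zip(centroids[t], x)]
            centroids[p] = [a - lr * b for a, b in zip(centroids[p], x)]
        centroids = {c: normalize(v) for c, v in centroids.items()}
    return centroids
```

Calling `fit` on a list of term-weight vectors and their labels returns a dictionary of unit-length centroids, and `predict` then labels a new vector by its most similar centroid. Both training and prediction stay linear in the number of classes per instance, apart from the pairwise similarity scan in the assumed border-selection step.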

Details

Language :
English
ISSN :
09252312
Volume :
101
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
83323229
Full Text :
https://doi.org/10.1016/j.neucom.2012.08.019