Towards enhancing centroid classifier for text classification—A border-instance approach
- Source :
- Neurocomputing. Feb2013, Vol. 101, p299-308. 10p.
- Publication Year :
- 2013
Abstract
- Text classification/categorization (TC) is the task of assigning new, unlabeled natural-language documents to predefined thematic categories. The centroid-based classifier (CC) has been widely used for TC because of its simplicity and efficiency. However, it has long been criticized for its relatively low classification accuracy compared with state-of-the-art classifiers such as support vector machines (SVMs). In this paper, we find that a CC whose centroid vectors are constructed from border instances only, rather than from all instances, can obtain higher generalization accuracy. Along this line, we propose the Border-Instance-based Iteratively Adjusted Centroid Classifier (IACC_BI), which relies on border instances found by some routine, e.g. the 1-Nearest-and-1-Furthest-Neighbors strategy, to construct centroid vectors for CC. IACC_BI then iteratively adjusts the initial centroid vectors according to the misclassified training instances. Our extensive experiments on 11 real-world text corpora demonstrate that IACC_BI greatly improves the performance of centroid-based classifiers and obtains classification accuracy competitive with the well-known SVMs, at significantly lower computational cost. [ABSTRACT FROM AUTHOR]
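The pipeline outlined in the abstract (select border instances, build centroids from them, then adjust the centroids using misclassified training instances) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not define the 1-Nearest-and-1-Furthest-Neighbors routine or the adjustment rule. The code assumes that each instance contributes its nearest other-class instance and its furthest same-class instance to the border set, that similarity is cosine, and that misclassifications trigger a perceptron-style update. The function names and the `lr` and `epochs` parameters are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def select_border(X, y):
    """ASSUMED border rule: for every instance, keep its most similar
    instance from another class and its least similar same-class instance.
    X must already be unit-normalized. Returns sorted instance indices."""
    keep = set()
    for i, (v, label) in enumerate(zip(X, y)):
        others = [j for j in range(len(X)) if y[j] != label]
        peers = [j for j in range(len(X)) if y[j] == label and j != i]
        if others:
            keep.add(max(others, key=lambda j: dot(v, X[j])))
        # A class with a single instance keeps that instance itself.
        keep.add(min(peers, key=lambda j: dot(v, X[j])) if peers else i)
    return sorted(keep)


def build_centroids(X, y):
    """Plain centroid classifier: sum the vectors per class, then normalize."""
    sums = {}
    for v, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(v))
        for i, x in enumerate(v):
            acc[i] += x
    return {label: normalize(acc) for label, acc in sums.items()}


def predict(centroids, v):
    """Assign v to the class whose centroid has the highest cosine similarity."""
    v = normalize(v)
    return max(sorted(centroids), key=lambda c: dot(centroids[c], v))


def fit(X, y, epochs=10, lr=0.1):
    """Build centroids from border instances, then adjust them with an
    ASSUMED perceptron-style rule on misclassified training instances."""
    Xn = [normalize(v) for v in X]
    border = select_border(Xn, y)
    centroids = build_centroids([Xn[i] for i in border], [y[i] for i in border])
    for _ in range(epochs):
        errors = 0
        for v, label in zip(Xn, y):
            guess = predict(centroids, v)
            if guess != label:
                errors += 1
                # Pull the true centroid toward v, push the wrong one away.
                centroids[label] = normalize(
                    [c + lr * x for c, x in zip(centroids[label], v)])
                centroids[guess] = normalize(
                    [c - lr * x for c, x in zip(centroids[guess], v)])
        if errors == 0:
            break
    return centroids


# Toy two-class data standing in for term-weight vectors (made up for illustration).
X = [[1.0, 0.1], [0.9, 0.3], [0.8, 0.5], [0.5, 0.8], [0.3, 0.9], [0.1, 1.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit(X, y)
print(predict(model, [0.7, 0.2]), predict(model, [0.2, 0.7]))
```

The pairwise search in `select_border` is quadratic in the number of training instances, which is acceptable for a toy example but would need an index or sampling on a real corpus.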
Details
- Language :
- English
- ISSN :
- 09252312
- Volume :
- 101
- Database :
- Academic Search Index
- Journal :
- Neurocomputing
- Publication Type :
- Academic Journal
- Accession number :
- 83323229
- Full Text :
- https://doi.org/10.1016/j.neucom.2012.08.019