A clustering-based KNN improved algorithm CLKNN for text classification
- Source :
- 2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010).
- Publication Year :
- 2010
- Publisher :
- IEEE, 2010.
Abstract
- As a simple, effective, nonparametric classification method, k Nearest Neighbor (KNN) is widely used in document classification, including harder settings such as large-scale corpora or many categories. However, the KNN classifier can lose precision when the training samples are unevenly distributed, because the density of the training data differs across regions. To address this, the paper presents a new clustering-based KNN method. It preprocesses the training data by clustering, then classifies with a new KNN algorithm that dynamically adjusts the neighborhood number K in each iteration. This method is intended to avoid the uneven classification phenomenon and reduce the misjudgment of boundary test samples. An experiment in text classification shows that it performs well.
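The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the paper's CLKNN. It assumes that (a) the clustering step is a plain k-means run separately on each class, so that dense and sparse classes end up with the same number of representatives, and (b) the "dynamic adjustment" of K means growing K until one class holds a strict majority among the nearest representatives. The function names (`kmeans`, `condense`, `classify`) and the parameters `clusters_per_class`, `k_start`, and `k_max` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means on tuples of floats; returns a list of centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, min(n_clusters, len(points)))
    for _ in range(n_iter):
        groups = defaultdict(list)
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # New centroid = mean of its group; an empty group keeps its old centroid.
        centroids = [
            tuple(sum(col) / len(groups[i]) for col in zip(*groups[i]))
            if groups[i] else centroids[i]
            for i in range(len(centroids))
        ]
    return centroids


def condense(samples, labels, clusters_per_class=2):
    """Assumed preprocessing: replace each class by a few cluster centroids."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(tuple(x))
    reps = []
    for y, pts in by_class.items():
        reps.extend((c, y) for c in kmeans(pts, clusters_per_class))
    return reps


def classify(reps, query, k_start=3, k_max=7):
    """Assumed dynamic-K rule: enlarge K until one class has a strict majority."""
    ranked = sorted(reps, key=lambda r: math.dist(query, r[0]))
    k_max = min(k_max, len(ranked))
    for k in range(min(k_start, k_max), k_max + 1):
        votes = Counter(label for _, label in ranked[:k]).most_common()
        if len(votes) == 1 or votes[0][1] > votes[1][1]:
            return votes[0][0]
    return ranked[0][1]  # still tied at k_max: fall back to the nearest representative
```

In this sketch, condensing each class to the same number of centroids is what counteracts uneven density: a class with many training documents no longer outvotes a sparse class simply by having more neighbors near the boundary. Document vectors (for example, term-weight vectors) would be passed in as tuples of floats.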
- Subjects :
- Training set
Computer science
business.industry
Document classification
Pattern recognition
Boundary testing
computer.software_genre
Knn classifier
k-nearest neighbors algorithm
Statistical classification
ComputingMethodologies_PATTERNRECOGNITION
Algorithm design
Artificial intelligence
Data mining
business
Cluster analysis
computer
Details
- Database :
- OpenAIRE
- Journal :
- 2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010)
- Accession number :
- edsair.doi...........6c6a95c567c7a882a571e6a5aa3669b2