Clustering and maximum likelihood search for efficient statistical classification with medium-sized databases
- Source :
- Optimization Letters. 11:329-341
- Publication Year :
- 2015
- Publisher :
- Springer Science and Business Media LLC, 2015.
Abstract
- This paper addresses the insufficient performance of statistical classification with medium-sized databases (thousands of classes). Each object is represented as a sequence of independent segments, and each segment is defined as a random sample of independent features with a multivariate distribution of exponential type. To speed up classification by the optimal Kullback–Leibler minimum information discrimination principle, we cluster the training set and perform an approximate nearest neighbor search for the input object among the cluster medoids. Using the asymptotic properties of the Kullback–Leibler divergence, we propose the maximum likelihood search procedure: the next medoid to check is selected from the cluster with the maximal joint density (likelihood) of the distances to the previously checked medoids. Experimental results in image recognition with an artificially generated dataset and the Essex facial database show that the proposed approach is much more effective than an exhaustive search and the known approximate nearest neighbor methods from the FLANN and NonMetricSpace libraries.
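The general idea of searching among cluster medoids with the Kullback–Leibler divergence can be illustrated with a minimal sketch. It is not the paper's maximum likelihood search procedure, whose details the abstract does not give. It assumes objects are plain discrete histograms and that clusters are already available as lists of indices, and it only shows the simpler two-stage scheme of picking the closest medoid and then searching inside that cluster. The names `kl_divergence`, `find_medoid` and `classify` are hypothetical.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) between two discrete histograms.

    A small eps is added (an assumption of this sketch) to avoid log(0).
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def find_medoid(members, items):
    """Return the index of the cluster member with the smallest total
    divergence to all members of the same cluster."""
    costs = [
        sum(kl_divergence(items[i], items[j]) for j in members)
        for i in members
    ]
    return members[int(np.argmin(costs))]

def classify(query, items, labels, clusters):
    """Two-stage approximate nearest neighbor search.

    1. Compare the query only with the cluster medoids.
    2. Search exhaustively inside the cluster of the closest medoid.
    """
    medoids = [find_medoid(c, items) for c in clusters]
    best_cluster = min(
        range(len(clusters)),
        key=lambda k: kl_divergence(query, items[medoids[k]]),
    )
    best_item = min(
        clusters[best_cluster],
        key=lambda i: kl_divergence(query, items[i]),
    )
    return labels[best_item]
```

With `K` clusters of roughly equal size over `N` training objects, this sketch computes about `K + N/K` divergences per query instead of `N`, at the cost of possibly missing the true nearest neighbor when it lies in another cluster.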
- Subjects :
- 021103 operations research
Control and Optimization
Database
business.industry
Nearest neighbor search
0211 other engineering and technologies
Brute-force search
Pattern recognition
02 engineering and technology
computer.software_genre
Medoid
k-nearest neighbors algorithm
Statistical classification
ComputingMethodologies_PATTERNRECOGNITION
Exponential family
Computer Science::Computer Vision and Pattern Recognition
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Artificial intelligence
Cluster analysis
business
Divergence (statistics)
computer
Mathematics
Details
- ISSN :
- 1862-4480 and 1862-4472
- Volume :
- 11
- Database :
- OpenAIRE
- Journal :
- Optimization Letters
- Accession number :
- edsair.doi...........9935dad0061d05c1c556e38e652521d9
- Full Text :
- https://doi.org/10.1007/s11590-015-0948-6