1. Method for Determining the Optimal Number of Clusters Based on Agglomerative Hierarchical Clustering
- Author
- Shibing Zhou, Fei Liu, and Zhenyuan Xu
- Subjects
Fuzzy clustering; Brown clustering; Computer Networks and Communications; business.industry; 020209 energy; Single-linkage clustering; Correlation clustering; Pattern recognition; 02 engineering and technology; computer.software_genre; Complete-linkage clustering; Computer Science Applications; Hierarchical clustering; Determining the number of clusters in a data set; ComputingMethodologies_PATTERNRECOGNITION; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; Data mining; Artificial intelligence; Cluster analysis; business; computer; Software; Mathematics
- Abstract
Determining the optimal number of clusters is crucial to clustering quality in cluster analysis. From the standpoint of sample geometry, two concepts are defined, the sample clustering dispersion degree and the sample clustering synthesis degree, and a new clustering validity index is designed. Moreover, a method for determining the optimal number of clusters based on an agglomerative hierarchical clustering (AHC) algorithm is proposed. The new index and the method can evaluate the clustering results produced by AHC and determine the optimal number of clusters for multiple types of datasets, such as those with linear, manifold, annular, and convex structures. Theoretical analysis and experimental results indicate that the proposed index and method are valid and perform well.
- Published
- 2017
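
The general workflow the abstract describes (run AHC, score the partition at each candidate cluster count with a validity index, keep the best-scoring count) can be sketched roughly as follows. This is only an illustration under stated assumptions: the record does not define the paper's own index, so the well-known silhouette coefficient is swapped in as a stand-in, average linkage is an arbitrary choice, and the names `ahc_partitions`, `silhouette`, and `choose_k` are hypothetical.

```python
import math
import random


def average_linkage(points, ci, cj):
    # Mean pairwise Euclidean distance between the members of two clusters.
    total = sum(math.dist(points[i], points[j]) for i in ci for j in cj)
    return total / (len(ci) * len(cj))


def ahc_partitions(points, k_max):
    # Naive bottom-up merging; records the partition at each k in 2..k_max.
    clusters = [[i] for i in range(len(points))]
    partitions = {}
    while len(clusters) > 1:
        if len(clusters) <= k_max:
            partitions[len(clusters)] = [list(c) for c in clusters]
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(points, clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return partitions


def silhouette(points, clusters):
    # Stand-in validity index (NOT the index proposed in the paper).
    total = 0.0
    for ci, members in enumerate(clusters):
        for i in members:
            if len(members) == 1:
                continue  # a singleton contributes 0 by convention
            a = sum(math.dist(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            if max(a, b) > 0:
                total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_max):
    # Score every recorded partition and return the best k with all scores.
    scores = {k: silhouette(points, part)
              for k, part in ahc_partitions(points, k_max).items()}
    return max(scores, key=scores.get), scores


# Synthetic data: three well-separated Gaussian blobs (illustrative only).
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in centers for _ in range(15)]

best_k, scores = choose_k(data, k_max=6)
print(best_k, {k: round(s, 3) for k, s in sorted(scores.items())})
```

The silhouette coefficient tends to favor compact, convex clusters, so this stand-in would not be expected to handle the manifold or annular structures that the abstract says the proposed index targets; single linkage and a different index would be needed for those cases.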