1. Neighbor-Relationship-Based Adaptive Density Peak Clustering
- Author
Zhigang Su, Qian Gao, Jingtang Hao, Yue Wang, and Bing Han
- Subjects
Spatial clustering, density peak, uneven density, neighbor relationship, automatic clustering, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The Density Peak Clustering (DPC) algorithm struggles on spatial datasets whose clusters differ significantly in density or contain multiple density peaks: cluster centers are difficult to choose, and an incorrectly assigned data point can trigger a chain reaction of further incorrect assignments. To address these problems, this paper proposes a neighbor-relationship-based Adaptive DPC (NRA-DPC) algorithm that enhances the local density definition, optimizes the selection of cluster centers, and improves the assignment strategy for non-cluster-center data points. NRA-DPC defines the local density of each data point from its reverse K-nearest neighbors and divides the spatial dataset into a core point set and a boundary point set according to the number of elements in each point's reverse K-nearest neighbor set. Cluster centers are selected iteratively from the core point set, and non-cluster-center data points are assigned, forming the initial clusters. For each initial cluster formed from the core point set, the corresponding minimum spanning tree (MST) is generated, and the cluster's assignment threshold is set based on the average edge length of that MST. The points in the boundary point set are then assigned using this threshold together with the mutual K-nearest neighbor relationship. Experimental results indicate that, compared with other typical clustering algorithms, NRA-DPC selects cluster centers automatically, reduces the probability of incorrectly assigning non-cluster-center data points, and effectively suppresses the chain reaction such errors trigger, giving more stable clustering performance across different datasets.
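The two building blocks named in the abstract, reverse K-nearest neighbor sets and the average MST edge length of a cluster, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names are hypothetical, and the core/boundary rule used here (a point is a core point when at least `k` other points list it among their K nearest neighbors) is an assumption, since the abstract does not state the exact criterion. The abstract also does not say how the average edge length is turned into an assignment threshold, so the sketch only computes the average.

```python
import math

def knn_indices(points, k):
    """For each point, the indices of its k nearest neighbors (itself excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

def reverse_knn(points, k):
    """rknn[i] is the set of points that have i among their own k nearest neighbors."""
    rknn = [set() for _ in points]
    for i, nbrs in enumerate(knn_indices(points, k)):
        for j in nbrs:
            rknn[j].add(i)
    return rknn

def split_core_boundary(points, k, min_reverse=None):
    """Split point indices into (core, boundary) by reverse-KNN set size.

    ASSUMPTION: the cutoff min_reverse defaults to k; the paper's actual
    criterion may differ.
    """
    if min_reverse is None:
        min_reverse = k
    rknn = reverse_knn(points, k)
    core = [i for i, s in enumerate(rknn) if len(s) >= min_reverse]
    boundary = [i for i, s in enumerate(rknn) if len(s) < min_reverse]
    return core, boundary

def mst_mean_edge_length(points):
    """Average edge length of the Euclidean MST of the points (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge connecting each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total / (n - 1)

# Toy data: a tight unit square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
core, boundary = split_core_boundary(points, k=2)
mean_edge = mst_mean_edge_length([points[i] for i in core])
```

On this toy data, no point lists the distant point among its nearest neighbors, so its reverse K-nearest neighbor set is empty and it lands in the boundary set. This illustrates why a reverse-neighbor count can separate sparse outlying points from dense interiors without a global distance cutoff.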
- Published
- 2024