1. Distributed Density Peaks Clustering Revisited.
- Author
Lu, Jing; Zhao, Yuhai; Tan, Kian-Lee; and Wang, Zhengkui
- Subjects
DENSITY, PARALLEL algorithms
- Abstract
Density Peaks (DP) Clustering organizes data into clusters by finding peaks in dense regions. This involves computing the density ($\rho$) and distance ($\delta$) of every point. As such, though DP has been very effective in producing high-quality clusters, its complexity is $O(N^2)$, where $N$ is the number of data points. In this paper, we propose a fast distributed density peaks clustering algorithm, FDDP, based on the z-value index. In FDDP, we first employ the z-value index to map multi-dimensional data points into one-dimensional space, and then range-partition the data according to the z-value to balance the load across the processing nodes. We ensure a minimal overlapping range to handle computations at the boundary points. We also propose FC, an efficient algorithm that employs a forward computing strategy to calculate $\rho$ linearly. Additionally, we propose another algorithm, CB, which uses a caching and efficient searching strategy to compute $\delta$. Moreover, FDDP is able to reduce the time complexity from $O(N^2)$ to $O(N \log N)$. We provide a theoretical analysis of FDDP and evaluate FDDP empirically. Our experimental results show that FDDP outperforms the state-of-the-art algorithms significantly. [ABSTRACT FROM AUTHOR]
- Published
- 2022
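The z-value mapping and range partitioning described in the abstract can be illustrated with a minimal sketch. This is not the authors' FDDP implementation: the function names `z_value` and `range_partition`, the `bits` parameter, and the assumption that coordinates are already quantized to non-negative integers are all hypothetical choices made for illustration. The sketch interleaves coordinate bits (a Morton code) to obtain a one-dimensional key, sorts by that key, and cuts the sorted order into contiguous ranges of near-equal size. The overlapping boundary ranges and the FC and CB algorithms mentioned in the abstract are not modeled here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def z_value(coords: Sequence[int], bits: int = 16) -> int:
    """Interleave the low `bits` bits of each coordinate into one integer.

    Assumes coordinates are non-negative integers (hypothetical
    pre-quantization step; real-valued data would need scaling first).
    """
    code = 0
    dims = len(coords)
    for bit in range(bits):
        for d, c in enumerate(coords):
            # Bit `bit` of dimension `d` goes to position bit * dims + d.
            code |= ((c >> bit) & 1) << (bit * dims + d)
    return code


def range_partition(points: List[Point], num_parts: int,
                    bits: int = 16) -> List[List[Point]]:
    """Sort points by z-value and split them into contiguous ranges.

    Range sizes differ by at most one point, which is one simple way
    to balance load; it is an assumption, not the paper's exact scheme.
    """
    ordered = sorted(points, key=lambda p: z_value(p, bits))
    size, extra = divmod(len(ordered), num_parts)
    parts: List[List[Point]] = []
    start = 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(ordered[start:end])
        start = end
    return parts


if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    for i, part in enumerate(range_partition(grid, 4, bits=2)):
        print(i, part)
```

Because nearby z-values tend to correspond to nearby points, each contiguous range mostly groups spatially close points, which is why a small overlap at range boundaries can suffice for neighborhood computations.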