Distributed Density Peaks Clustering Revisited.
- Source :
- IEEE Transactions on Knowledge & Data Engineering; Aug2022, Vol. 34 Issue 8, p3714-3726, 13p
- Publication Year :
- 2022
- Abstract :
- Density Peaks (DP) clustering organizes data into clusters by finding peaks in dense regions. This involves computing the density ($\rho$) and distance ($\delta$) of every point. As such, though DP has been very effective in producing high-quality clusters, its complexity is $O(N^2)$, where $N$ is the number of data points. In this paper, we propose a fast distributed density peaks clustering algorithm, FDDP, based on the z-value index. In FDDP, we first employ the z-value index to map multi-dimensional data points into one-dimensional space, and then range-partition the data according to the z-value to balance the load across the processing nodes. We ensure a minimal overlapping range to handle computations at the boundary points. We also propose FC, an efficient algorithm that employs a forward computing strategy to calculate $\rho$ linearly. Additionally, we propose another algorithm, CB, which uses a caching and efficient searching strategy to compute $\delta$. Moreover, FDDP is able to reduce the time complexity from $O(N^2)$ to $O(N \cdot \log(N))$. We provide a theoretical analysis of FDDP and evaluate it empirically. Our experimental results show that FDDP outperforms the state-of-the-art algorithms significantly. [ABSTRACT FROM AUTHOR]
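The ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' FDDP, FC, or CB implementation, and the record gives no details of those algorithms. The sketch shows three things: a z-value (Morton code) built by bit interleaving, a simple range partition over the sorted z-values, and the brute-force $O(N^2)$ computation of $\rho$ and $\delta$ that serves as the baseline. All function names, the `bits` parameter, and the cutoff distance `d_c` are assumptions made for illustration. The $\rho$ and $\delta$ definitions follow the commonly used cutoff-kernel formulation of DP, which the abstract does not spell out.

```python
import math


def z_value(coords, bits=16):
    """Interleave the bits of non-negative integer coordinates (Morton code).

    Hypothetical helper: maps a multi-dimensional point to one integer so
    that points can be ordered in one-dimensional space.
    """
    z = 0
    d = len(coords)
    for bit in range(bits):
        for dim, c in enumerate(coords):
            z |= ((c >> bit) & 1) << (bit * d + dim)
    return z


def range_partition(points, num_parts, bits=16):
    """Sort points by z-value and cut them into contiguous, near-equal ranges.

    Assumption: equal-sized chunks stand in for load balancing. The
    overlapping boundary ranges mentioned in the abstract are not modeled.
    """
    ordered = sorted(points, key=lambda p: z_value(p, bits))
    size = max(1, math.ceil(len(ordered) / num_parts))
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def naive_density_peaks(points, d_c):
    """Brute-force O(N^2) baseline for rho and delta.

    rho[i]   = number of other points closer than the cutoff d_c.
    delta[i] = distance to the nearest point with strictly higher rho,
               or the largest distance from i if no such point exists.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta
```

For example, `range_partition([(0, 0), (1, 1), (3, 3), (2, 2)], 2)` orders the points along the z-curve before splitting them, so nearby points tend to land in the same partition. The pairwise distance matrix in `naive_density_peaks` is the quadratic cost the abstract says FDDP avoids.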
- Subjects :
- DENSITY
- PARALLEL algorithms
Details
- Language :
- English
- ISSN :
- 10414347
- Volume :
- 34
- Issue :
- 8
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Knowledge & Data Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 157931421
- Full Text :
- https://doi.org/10.1109/TKDE.2020.3034611