
Distributed Density Peaks Clustering Revisited.

Authors :
Lu, Jing
Zhao, Yuhai
Tan, Kian-Lee
Wang, Zhengkui
Source :
IEEE Transactions on Knowledge & Data Engineering; Aug2022, Vol. 34 Issue 8, p3714-3726, 13p
Publication Year :
2022

Abstract

Density Peaks (DP) Clustering organizes data into clusters by finding peaks in dense regions. This involves computing the density ($\rho$) and distance ($\delta$) of every point. As such, though DP has been very effective in producing high-quality clusters, its complexity is $O(N^2)$, where $N$ is the number of data points. In this paper, we propose a fast distributed density peaks clustering algorithm, FDDP, based on the z-value index. In FDDP, we first employ the z-value index to map multi-dimensional data points into one-dimensional space, and then range-partition the data according to the z-value to balance the load across the processing nodes. We ensure a minimal overlapping range to handle computations at the boundary points. We also propose FC, an efficient algorithm that employs a forward computing strategy to calculate $\rho$ linearly. Additionally, we propose another algorithm, CB, which uses a caching and efficient searching strategy to compute $\delta$. Moreover, FDDP is able to reduce the time complexity from $O(N^2)$ to $O(N \cdot \log(N))$. We provide a theoretical analysis of FDDP and evaluate it empirically. Our experimental results show that FDDP outperforms the state-of-the-art algorithms significantly. [ABSTRACT FROM AUTHOR]
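
The z-value mapping and range partitioning described in the abstract can be illustrated with a minimal sketch. This is not the authors' FDDP implementation: the function names `z_value` and `range_partition`, the 16-bit default, and the assumption that coordinates are already quantized to non-negative integers are all hypothetical choices made for illustration. The overlapping boundary ranges and the FC and CB algorithms are not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def z_value(coords: Sequence[int], bits: int = 16) -> int:
    """Interleave the low `bits` bits of each coordinate (Morton / z-order code).

    Assumes every coordinate is a non-negative integer below 2**bits;
    real-valued data would first have to be quantized onto such a grid.
    """
    dims = len(coords)
    z = 0
    for bit in range(bits):
        for d, c in enumerate(coords):
            # Bit `bit` of dimension `d` goes to position bit * dims + d.
            z |= ((c >> bit) & 1) << (bit * dims + d)
    return z


def range_partition(points: Sequence[Point], num_parts: int,
                    bits: int = 16) -> List[List[Point]]:
    """Sort points by z-value and cut the order into contiguous ranges.

    Each range holds a near-equal number of points, which is one simple
    way to balance load; it is an illustrative assumption, not necessarily
    the partitioning rule used in the paper.
    """
    ordered = sorted(points, key=lambda p: z_value(p, bits))
    n = len(ordered)
    bounds = [i * n // num_parts for i in range(num_parts + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(num_parts)]


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (1, 0), (0, 1), (2, 3), (3, 2)]
    for part in range_partition(pts, num_parts=3):
        print([(p, z_value(p)) for p in part])
```

Because points that are close in z-order tend to be close in the original space, each contiguous range can be handed to one processing node; points near a range boundary are the ones that need the overlapping range the abstract mentions.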

Subjects

Subjects :
DENSITY
PARALLEL algorithms

Details

Language :
English
ISSN :
10414347
Volume :
34
Issue :
8
Database :
Complementary Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
157931421
Full Text :
https://doi.org/10.1109/TKDE.2020.3034611