Back to Search Start Over

Density-Based clustering in mapReduce with guarantees on parallel time, space, and solution quality

Authors :
Sepideh Aghamolaei
Mohammad Ghodsi
Source :
Transactions on Combinatorics, Vol 14, Iss 3, Pp 135-156 (2024)
Publication Year :
2024
Publisher :
University of Isfahan, 2024.

Abstract

A well-known clustering problem called Density-Based Spatial Clustering of Applications with Noise~(DBSCAN) involves computing the solutions of at least one disk range query per input point, computing the connected components of a graph, and bichromatic fixed-radius nearest neighbor. MapReduce class is a model where a sublinear number of machines, each with sublinear memory, run for a polylogarithmic number of parallel rounds. Most of these problems either require quadratic time in the sequential model or are hard to compute in a constant number of rounds in MapReduce. In the Euclidean plane, DBSCAN algorithms with near-linear time and a randomized parallel algorithm with a polylogarithmic number of rounds exist. We solve DBSCAN in the Euclidean plane in a constant number of rounds in MapReduce, assuming the minimum number of points in range queries is constant and each connected component fits inside the memory of a single machine and has a constant diameter.

Details

Language :
English
ISSN :
22518657 and 22518665
Volume :
14
Issue :
3
Database :
Directory of Open Access Journals
Journal :
Transactions on Combinatorics
Publication Type :
Academic Journal
Accession number :
edsdoj.3dd6ea04d1d64e14b54a0f5b42327c3b
Document Type :
article
Full Text :
https://doi.org/10.22108/toc.2024.138377.2091