1. Efficient Centroid-Linkage Clustering
- Author
Bateni, MohammadHossein, Dhulipala, Laxman, Fletcher, Willem, Gowda, Kishen N, Hershkowitz, D Ellis, Jayaram, Rajesh, and Łącki, Jakub
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
We give an efficient algorithm for Centroid-Linkage Hierarchical Agglomerative Clustering (HAC), which computes a $c$-approximate clustering in roughly $n^{1+O(1/c^2)}$ time. We obtain our result by combining a new Centroid-Linkage HAC algorithm with a novel fully dynamic data structure for nearest neighbor search which works under adaptive updates. We also evaluate our algorithm empirically. By leveraging a state-of-the-art nearest-neighbor search library, we obtain a fast and accurate Centroid-Linkage HAC algorithm. Compared to an existing state-of-the-art exact baseline, our implementation maintains the clustering quality while delivering up to a $36\times$ speedup due to performing fewer distance comparisons.
- Published
2024
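
For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of exact Centroid-Linkage HAC: repeatedly merge the two clusters whose centroids are closest, and replace them with a cluster whose centroid is the size-weighted mean. This is a naive brute-force illustration, not the authors' algorithm. It performs a full pairwise scan at every step, which is the cost the paper avoids by using approximate, dynamic nearest-neighbor search. The function name `centroid_hac` and the merge-list output format are assumptions made for this example.

```python
import math


def centroid_hac(points):
    """Naive exact centroid-linkage HAC (illustrative sketch only).

    Returns a list of merges (id_a, id_b, centroid_distance).
    Input points get ids 0..n-1; each merged cluster gets the next free id.
    """
    # Active clusters: id -> (centroid, number of points in the cluster)
    clusters = {i: (tuple(p), 1) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []

    while len(clusters) > 1:
        # Brute-force search for the pair of clusters with the closest centroids.
        best = None
        ids = sorted(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                d = math.dist(clusters[a][0], clusters[b][0])
                if best is None or d < best[0]:
                    best = (d, a, b)

        d, a, b = best
        (ca, na) = clusters.pop(a)
        (cb, nb) = clusters.pop(b)

        # The merged centroid is the size-weighted mean of the two centroids.
        n = na + nb
        centroid = tuple((na * x + nb * y) / n for x, y in zip(ca, cb))
        clusters[next_id] = (centroid, n)
        merges.append((a, b, d))
        next_id += 1

    return merges


if __name__ == "__main__":
    for merge in centroid_hac([(0, 0), (1, 0), (10, 0), (11, 0)]):
        print(merge)
```

Each iteration scans all remaining pairs, so this sketch takes roughly cubic time in the number of points. It is meant only to make the merge rule concrete, not to reflect the running time or data structures described in the abstract.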