FLASC: A Flare-Sensitive Clustering Algorithm

Authors :
Bot, D. M.
Peeters, J.
Liesenborgs, J.
Aerts, J.
Publication Year :
2023

Abstract

Clustering algorithms are often used to find subpopulations in exploratory data analysis workflows. Not only the clusters themselves, but also their shape can represent meaningful subpopulations. In this paper, we present FLASC, an algorithm that detects branches within clusters to identify such subpopulations. FLASC builds upon HDBSCAN*, a state-of-the-art density-based clustering algorithm, and detects branches in a post-processing step that describes within-cluster connectivity. Two variants of the algorithm are presented, which trade computational cost for noise robustness. We show that both variants scale similarly to HDBSCAN* in terms of computational cost and provide stable outputs using synthetic data sets, resulting in an efficient flare-sensitive clustering algorithm. In addition, we demonstrate the benefit of branch-detection on two real-world data sets.

Comment: Previously, 20 pages, 11 figures, submitted to ACM TKDD. Now, 15 pages, 8 figures, submitted to PeerJ Computer Science (simplified method and rewritten for clarity)

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2311.15887
Document Type :
Working Paper