Back to Search Start Over

Clustering with minimum spanning trees: How good can it be?

Authors :
Gagolewski, Marek
Cena, Anna
Bartoszuk, Maciej
Brzozowski, Łukasz
Source :
Journal of Classification, 2024
Publication Year :
2023

Abstract

Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in low-dimensional partitional data clustering tasks. By identifying the upper bounds for the agreement between the best (oracle) algorithm and the expert labels from a large battery of benchmark data, we discover that MST methods can be very competitive. Next, we review, study, extend, and generalise a few existing, state-of-the-art MST-based partitioning schemes. This leads to some new noteworthy approaches. Overall, the Genie and the information-theoretic methods often outperform the non-MST algorithms such as K-means, Gaussian mixtures, spectral clustering, Birch, density-based, and classical hierarchical agglomerative procedures. Nevertheless, we identify that there is still some room for improvement, and thus the development of novel algorithms is encouraged.

Details

Database :
arXiv
Journal :
Journal of Classification, 2024
Publication Type :
Report
Accession number :
edsarx.2303.05679
Document Type :
Working Paper
Full Text :
https://doi.org/10.1007/s00357-024-09483-1