Back to Search
Start Over
Clustering with minimum spanning trees: How good can it be?
- Source :
- Journal of Classification, 2024
- Publication Year :
- 2023
-
Abstract
- Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in low-dimensional partitional data clustering tasks. By identifying the upper bounds for the agreement between the best (oracle) algorithm and the expert labels from a large battery of benchmark data, we discover that MST methods can be very competitive. Next, we review, study, extend, and generalise a few existing, state-of-the-art MST-based partitioning schemes. This leads to some new noteworthy approaches. Overall, the Genie and the information-theoretic methods often outperform the non-MST algorithms such as K-means, Gaussian mixtures, spectral clustering, Birch, density-based, and classical hierarchical agglomerative procedures. Nevertheless, we identify that there is still some room for improvement, and thus the development of novel algorithms is encouraged.
- Subjects :
- Statistics - Machine Learning
Computer Science - Machine Learning
Subjects
Details
- Database :
- arXiv
- Journal :
- Journal of Classification, 2024
- Publication Type :
- Report
- Accession number :
- edsarx.2303.05679
- Document Type :
- Working Paper
- Full Text :
- https://doi.org/10.1007/s00357-024-09483-1