Clustering strategy and method selection
- Publication Year :
- 2015
Abstract
- This paper is a chapter in the forthcoming Handbook of Cluster Analysis, Hennig et al. (2015). For definitions of basic clustering methods and some further methodology, the reader is referred to other chapters of the Handbook. Reading this version of the paper without the Handbook requires some knowledge of cluster analysis methodology. The aim of this chapter is to provide a framework for all the decisions that are required when carrying out a cluster analysis in practice. A general attitude to clustering is outlined that connects these decisions closely to the clustering aims in a given application. From this point of view, the chapter then discusses aspects of data processing such as the choice of representation of the objects to be clustered, dissimilarity design, and the transformation and standardization of variables. Regarding the choice of clustering method, the chapter explores how different methods correspond to different clustering aims. It then gives an overview of benchmarking studies comparing different clustering methods, as well as an outline of theoretical approaches that characterize desiderata for clustering by axioms. Finally, aspects of cluster validation, i.e., the assessment of the quality of a clustering in a given dataset, are discussed, including finding an appropriate number of clusters, testing homogeneity, internal and external cluster validation, assessing clustering stability, and data visualization.
- Subjects :
- Statistics - Methodology
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.1503.02059
- Document Type :
- Working Paper