Evaluation of the number of clusters in a data set using p-values from multiple tests of hypotheses.
- Source :
- Communications in Statistics: Theory & Methods. 2024, Vol. 53 Issue 24, p8878-8889. 12p.
- Publication Year :
- 2024
Abstract
- This article proposes a novel, nonparametric measure based on interpoint distances for investigating whether a given data set contains any groups and, if so, how many. It is a cluster accuracy index applicable to data sets of arbitrary dimension, in conjunction with any clustering algorithm that requires the number of groups to be specified a priori. We perform univariate, nonparametric multiple tests of hypotheses, in which as many dependent tests as the sample size are carried out using the interpoint distances. The resulting p-values are combined to reach a decision, which is taken in a step-wise process over the possible numbers of clusters. This reduces unnecessary computation compared with other accuracy measures from the literature. A data study establishes the proposed index's efficiency and superiority. [ABSTRACT FROM AUTHOR]
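The abstract does not state which test statistic or combination rule the authors use, so the following is only a rough sketch of the general idea (one distance-based test per observation, then a combined decision), not the article's procedure. It assumes a one-sided Mann-Whitney test with a normal approximation as the per-point test and a Bonferroni correction as the combination rule, the latter chosen because it remains valid when the tests are dependent. The function names, the toy data, and the labelling are all hypothetical.

```python
import math

def mann_whitney_less(x, y):
    """One-sided p-value for 'values in x tend to be smaller than values in y'.
    Uses the normal approximation without a tie correction (an assumption)."""
    n1, n2 = len(x), len(y)
    # U counts pairs where the x value exceeds the y value (ties count 1/2).
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return 0.5 * math.erfc(-z / math.sqrt(2))  # standard normal CDF at z

def per_point_p_values(points, labels):
    """One test per observation: are its distances to points in its own
    cluster smaller than its distances to points in the other clusters?"""
    pvals = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if j != i and labels[j] == labels[i]]
        other = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] != labels[i]]
        pvals.append(mann_whitney_less(own, other))
    return pvals

def bonferroni_combine(pvals):
    """Combined p-value that is valid under arbitrary dependence."""
    return min(1.0, len(pvals) * min(pvals))

# Hypothetical toy data: two well-separated groups of eight points in 2D.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1),
           (0.5, 0.5), (0.2, 0.8), (0.8, 0.2), (0.5, 0.1)]
group_b = [(x + 10, y + 10) for x, y in group_a]
points = group_a + group_b
labels = [0] * len(group_a) + [1] * len(group_b)  # e.g. from any clustering algorithm

pvals = per_point_p_values(points, labels)
combined = bonferroni_combine(pvals)
print(f"combined p-value for the 2-cluster labelling: {combined:.4f}")
```

In a step-wise use of such a sketch, one would repeat the calculation for labellings with 2, 3, ... clusters and stop according to a rule of one's choosing; the article's actual stopping rule is not given in the abstract.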
Details
- Language :
- English
- ISSN :
- 03610926
- Volume :
- 53
- Issue :
- 24
- Database :
- Academic Search Index
- Journal :
- Communications in Statistics: Theory & Methods
- Publication Type :
- Academic Journal
- Accession number :
- 180625133
- Full Text :
- https://doi.org/10.1080/03610926.2024.2309967