Evaluation of the number of clusters in a data set using p-values from multiple tests of hypotheses.

Authors :
Modak, Dr. Soumita
Source :
Communications in Statistics: Theory & Methods. 2024, Vol. 53 Issue 24, p8878-8889. 12p.
Publication Year :
2024

Abstract

This article proposes a novel, nonparametric, interpoint-distance-based measure for investigating whether a given data set contains any groups and, if so, how many. It is a cluster accuracy index applicable to data sets of arbitrary dimension, in association with any clustering algorithm that requires the number of groups to be specified a priori. We perform multiple univariate, nonparametric statistical tests of hypotheses, carrying out as many dependent tests as the sample size using the interpoint distances. Their p-values are combined to reach a decision, which is taken in a step-wise process over the possible numbers of clusters. This reduces unnecessary computation compared with other accuracy measures from the literature. A data study establishes the proposed index's efficiency and superiority. [ABSTRACT FROM AUTHOR]
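
Illustrative sketch (not taken from the article): the general idea in the abstract, one test per observation on its interpoint distances followed by a combination of the resulting p-values, might look roughly like the Python below. The record does not say which test, which combination rule, or which step-wise stopping rule the article uses. The one-sided Mann-Whitney test (normal approximation), the Bonferroni combination, and every function name here are therefore assumptions chosen for illustration, not the author's method.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mann_whitney_less(x, y):
    """One-sided p-value for H1: values in x tend to be smaller than in y.

    Assumption: normal approximation without tie correction.
    """
    n1, n2 = len(x), len(y)
    # U counts pairs (a, b) with a < b; ties count one half.
    u = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    # A large U supports H1, so the p-value is the upper normal tail.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def per_point_p_values(points, labels):
    """One test per observation: within-cluster vs. between-cluster distances."""
    pvals = []
    for i, p in enumerate(points):
        within = [euclidean(p, q) for j, q in enumerate(points)
                  if j != i and labels[j] == labels[i]]
        between = [euclidean(p, q) for j, q in enumerate(points)
                   if labels[j] != labels[i]]
        if not within or not between:
            # No comparison possible for this point: no evidence of grouping.
            pvals.append(1.0)
        else:
            pvals.append(mann_whitney_less(within, between))
    return pvals


def bonferroni_combined(pvals):
    """Combine dependent p-values; Bonferroni is valid under any dependence."""
    return min(1.0, len(pvals) * min(pvals))


# Toy data: two well-separated groups in two dimensions (hypothetical example).
rng = random.Random(0)
points = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(15)]
points += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(15)]
labels = [0] * 15 + [1] * 15  # stand-in for the output of a clustering algorithm

pvals = per_point_p_values(points, labels)
combined = bonferroni_combined(pvals)
print("combined p-value for K = 2:", combined)
```

In a step-wise use, one would rerun a clustering algorithm for K = 2, 3, ... and recompute the combined p-value for each labeling; the rule for when to stop is not given in this record and is left open here.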

Details

Language :
English
ISSN :
0361-0926
Volume :
53
Issue :
24
Database :
Academic Search Index
Journal :
Communications in Statistics: Theory & Methods
Publication Type :
Academic Journal
Accession number :
180625133
Full Text :
https://doi.org/10.1080/03610926.2024.2309967