1. Cluster ensemble selection based on a new cluster stability measure.
- Author
- Alizadeh, Hosein; Minaei-Bidgoli, Behrouz; Parvin, Hamid
- Subjects
- *CLUSTER analysis (Statistics), *PARALLEL algorithms, *DATA analysis, *QUANTITATIVE research, *PROBABILITY theory, *INFORMATION theory
- Abstract
Many stability measures, such as Normalized Mutual Information (NMI), have been proposed to validate a set of partitionings. It is highly possible that a set of partitionings contains one or more high-quality clusters but is still judged poor as a whole by a stability measure and, as a result, is completely neglected. Inspired by evaluation approaches that measure the efficacy of a set of partitionings, researchers have tried to define new measures for evaluating a single cluster. Thus far, the measures defined for assessing a cluster are mostly based on the well-known NMI measure. The drawback of this commonly used approach is discussed in this paper, after which a new asymmetric criterion, called the Alizadeh-Parvin-Moshki-Minaei criterion (APMM), is proposed to assess the association between a cluster and a set of partitionings. We show that the APMM criterion overcomes the deficiency in the conventional NMI measure. We also propose a clustering ensemble framework that incorporates the APMM's capabilities in order to find the best-performing clusters. The framework uses Average APMM (AAPMM) as a fitness measure to select a number of clusters instead of using all of the results. Any cluster that satisfies a predefined threshold of this measure is selected to participate in an elite ensemble. To combine the chosen clusters, a co-association matrix-based consensus function (by which the set of resultant partitionings is obtained) is used. Because Evidence Accumulation Clustering (EAC) cannot appropriately derive the co-association matrix from a subset of clusters, a new EAC-based method, called Extended EAC (EEAC), is employed to construct the co-association matrix from the chosen subset of clusters. Empirical studies show that our proposed approach outperforms other cluster ensemble approaches. [ABSTRACT FROM AUTHOR]
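The pipeline outlined in the abstract (score clusters, keep those above a threshold, build a co-association matrix) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the APMM or AAPMM formula, so the per-cluster `scores` below are hypothetical placeholders supplied by the caller. The abstract also does not define EEAC, so the normalization used in `co_association_from_clusters` (dividing each pair count by the larger of the two points' appearance counts) is an assumption chosen only to show why a subset of clusters needs different handling than full partitionings. All function names are illustrative.

```python
import numpy as np

def co_association(partitionings):
    """Standard EAC co-association: for each pair of points, the fraction
    of partitionings in which the two points share a cluster label."""
    labels = np.asarray(partitionings)              # shape (n_partitionings, n_points)
    same = labels[:, :, None] == labels[:, None, :]
    return same.mean(axis=0)

def select_clusters(clusters, scores, threshold):
    """Keep clusters whose score meets the threshold.
    `scores` stands in for a stability measure such as AAPMM (hypothetical values)."""
    return [c for c, s in zip(clusters, scores) if s >= threshold]

def co_association_from_clusters(clusters, n_points):
    """Co-association built from a subset of clusters (each a set of point indices).
    ASSUMPTION: each pair count is divided by max(n_i, n_j), where n_i is the
    number of selected clusters containing point i. This normalization is an
    illustration, not a formula taken from the abstract."""
    pair = np.zeros((n_points, n_points))
    appear = np.zeros(n_points)
    for c in clusters:
        idx = np.array(sorted(c), dtype=int)
        pair[np.ix_(idx, idx)] += 1                 # every pair inside the cluster co-occurs once
        appear[idx] += 1
    denom = np.maximum(appear[:, None], appear[None, :])
    # Pairs where neither point appears in any selected cluster get 0.
    return np.divide(pair, denom, out=np.zeros_like(pair), where=denom > 0)

# Two partitionings of four points, then a thresholded subset of clusters.
full = co_association([[0, 0, 1, 1], [0, 1, 1, 1]])
elite = select_clusters([{0, 1}, {2, 3}, {1, 2, 3}], [0.9, 0.2, 0.7], threshold=0.5)
subset = co_association_from_clusters(elite, n_points=4)
```

Once some clusters are discarded, a point may appear in fewer selected clusters than there were partitionings, so dividing by the number of partitionings, as plain EAC does, would understate its associations. The per-pair denominator in the sketch is one way to compensate.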
- Published
- 2014