Clustering Categorical Data in a Multiobjective Framework

Authors :
Ujjwal Maulik
Sanghamitra Bandyopadhyay
Anirban Mukhopadhyay
Source :
Multiobjective Genetic Algorithms for Clustering ISBN: 9783642166143
Publication Year :
2011
Publisher :
Springer Berlin Heidelberg, 2011.

Abstract

Most clustering algorithms are designed for datasets in which the dissimilarity between any two points can be computed using a standard distance measure such as the Euclidean distance. However, many real-life datasets are categorical in nature, with no natural ordering among the elements of an attribute domain. In such situations, clustering algorithms such as K-means [238] and fuzzy C-means (FCM) [62] cannot be applied. The K-means algorithm computes the center of a cluster as the mean of the feature vectors belonging to that cluster. However, because categorical datasets have no inherent distance measure, computing the mean of a set of feature vectors is meaningless. A variation of the K-means algorithm, namely Partitioning Around Medoids (PAM) or K-medoids [243], has been proposed for such datasets.
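
The contrast between a mean and a medoid can be illustrated with a minimal sketch. It is not taken from the chapter: the simple matching (Hamming) dissimilarity, the function names, and the toy records are all assumptions chosen for illustration. The idea shown is that a medoid is an actual member of the cluster, so it stays well defined even when attribute values cannot be averaged.

```python
from typing import Sequence

Record = Sequence[str]


def matching_dissimilarity(a: Record, b: Record) -> int:
    """Count the attributes on which two categorical records disagree.

    Assumption: simple matching is used here only as an example measure;
    the abstract does not say which dissimilarity the authors use.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def medoid(cluster: Sequence[Record]) -> Record:
    """Return the cluster member with the smallest total dissimilarity
    to all members of the cluster."""
    if not cluster:
        raise ValueError("cluster must not be empty")
    return min(
        cluster,
        key=lambda candidate: sum(
            matching_dissimilarity(candidate, other) for other in cluster
        ),
    )


# Hypothetical toy cluster: (colour, size, shape) has no meaningful "mean".
cluster = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
    ("red", "small", "square"),
]
center = medoid(cluster)
print(center)
```

In this toy cluster the first record differs from each of the others in exactly one attribute, so it has the lowest total dissimilarity and is selected as the medoid.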

Details

ISBN :
978-3-642-16614-3
ISBNs :
9783642166143
Database :
OpenAIRE
Journal :
Multiobjective Genetic Algorithms for Clustering ISBN: 9783642166143
Accession number :
edsair.doi...........f063308cf821ad8784e84e8cefce343f
Full Text :
https://doi.org/10.1007/978-3-642-16615-0_8