Space Structure and Clustering of Categorical Data

Authors :
Qian, Yuhua
Li, Feijiang
Liang, Jiye
Liu, Bing
Dang, Chuangyin
Source :
IEEE Transactions on Neural Networks and Learning Systems; October 2016, Vol. 27 Issue: 10 p2047-2059, 13p
Publication Year :
2016

Abstract

Learning from categorical data plays a fundamental role in areas such as pattern recognition, machine learning, data mining, and knowledge discovery. To effectively discover the group structure inherent in a set of categorical objects, many categorical clustering algorithms have been developed in the literature, among which k-modes-type algorithms are very representative because of their good performance. Nevertheless, their clustering performance still leaves much room for improvement compared with clustering algorithms for numeric data. This may be because categorical data lack a clear space structure like that of numeric data. To address this issue, we propose, in this paper, a novel data-representation scheme for categorical data, which maps a set of categorical objects into a Euclidean space. Based on this data-representation scheme, a general framework for space structure based categorical clustering algorithms (SBC) is designed. This framework, together with the application of two kinds of dissimilarities, leads to two versions of the SBC-type algorithms. To verify the performance of the SBC-type algorithms, we use four representative k-modes-type algorithms as references. Experiments show that the proposed SBC-type algorithms significantly outperform the k-modes-type algorithms.

Details

Language :
English
ISSN :
2162-237X and 2162-2388
Volume :
27
Issue :
10
Database :
Supplemental Index
Journal :
IEEE Transactions on Neural Networks and Learning Systems
Publication Type :
Periodical
Accession number :
ejs40042546
Full Text :
https://doi.org/10.1109/TNNLS.2015.2451151