Clustering Mixed-Type Data via Dirichlet Process Mixture Model with Cluster-Specific Covariance Matrices

Authors :
Nurul Afiqah Burhanuddin
Kamarulzaman Ibrahim
Hani Syahida Zulkafli
Norwati Mustapha
Source :
Symmetry, Vol 16, Iss 6, p 712 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

Many studies have shown successful applications of the Dirichlet process mixture model (DPMM) for clustering continuous data. Beyond continuous data, in practice, one can expect to see different data types, including ordinal and nominal data. Existing DPMMs for clustering mixed-type data assume a strict covariance matrix structure, resulting in an overfit model. This article explores a DPMM for mixed-type data that allows the covariance matrix to differ from one cluster to another. We assume an underlying latent variable framework for ordinal and nominal data, which is then modeled jointly with the continuous data. The identifiability issue on the covariance matrix poses computational challenges, thus requiring a nonstandard inferential algorithm. The applicability and flexibility of the proposed model are illustrated through simulation examples and real data applications.

Details

Language :
English
ISSN :
2073-8994
Volume :
16
Issue :
6
Database :
Directory of Open Access Journals
Journal :
Symmetry
Publication Type :
Academic Journal
Accession number :
edsdoj.70020d391e1741b9804cfaadb1fd0729
Document Type :
article
Full Text :
https://doi.org/10.3390/sym16060712