A comparison of latent class, K-means, and K-median methods for clustering dichotomous data.

Authors :: Brusco MJ; Shireman E; Steinley D
Source :: Psychological methods [Psychol Methods] 2017 Sep; Vol. 22 (3), pp. 563-580. Date of Electronic Publication: 2016 Sep 08.
Publication Year :: 2017
Abstract: The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data. (PsycINFO Database Record (c) 2017 APA, all rights reserved.)
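Illustrative note (not part of the published abstract): the contrast between K-median and K-means on dichotomous data can be sketched in a few lines. For 0/1 variables, the coordinate-wise median of a cluster is the majority value of each variable, so a K-median center is itself a binary profile, whereas a K-means center is a vector of proportions. The sketch below is a minimal alternating assign/update loop under those assumptions; the function names, the toy data, the starting centers, and the tie-breaking rule are all hypothetical and are not taken from the article, whose actual algorithms and simulation design are not described in this record.

```python
def l1_distance(row, center):
    """Sum of absolute differences; equals Hamming distance when both are 0/1."""
    return sum(abs(x - c) for x, c in zip(row, center))

def kmedian_center(rows):
    """Coordinate-wise median of 0/1 rows, i.e. the majority value per variable.
    Ties are broken toward 0 here (an arbitrary choice for this sketch)."""
    n = len(rows)
    return [1 if 2 * sum(col) > n else 0 for col in zip(*rows)]

def kmeans_center(rows):
    """Coordinate-wise mean of 0/1 rows, i.e. the proportion of 1s per variable."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def assign(rows, centers):
    """Label each row with the index of its nearest center under L1 distance."""
    return [min(range(len(centers)), key=lambda k: l1_distance(r, centers[k]))
            for r in rows]

def kmedian(rows, centers, max_iter=100):
    """Alternate assignment and median update until the centers stop changing."""
    labels = assign(rows, centers)
    for _ in range(max_iter):
        labels = assign(rows, centers)
        new_centers = []
        for k in range(len(centers)):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(kmedian_center(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Toy data (hypothetical): six objects measured on four dichotomous variables.
rows = [
    [1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 0, 0],
    [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1],
]
labels, centers = kmedian(rows, centers=[rows[0], rows[3]])
mean_center = kmeans_center(rows[:3])
```

In this toy case the K-median centers remain binary profiles that can be read directly as prototype response patterns, while `mean_center` holds per-variable proportions for the same three objects.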

Subjects :: Algorithms; Factor Analysis, Statistical; Humans; Research Design; Cluster Analysis; Models, Psychological; Models, Statistical
