Corpus ID: 865255

Latent class models for clustering: a comparison with K-means

Jay Magidson and Jeroen K. Vermunt
Recent developments in latent class (LC) analysis and associated software to include continuous variables offer a model-based alternative to more traditional clustering approaches such as K-means. In this paper, the authors compare these two approaches using data simulated from a setting where true group membership is known. The authors choose a setting favourable to K-means by simulating data according to the assumptions made in both discriminant analysis (DISC) and K-means clustering. Since… 
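
The kind of comparison the abstract describes can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' simulation design: the class means, the common standard deviation, the sample size, the starting-value rule, and all function names (`simulate`, `start_centers`, `kmeans`, `lc_cluster`, `error_rate`) are hypothetical. It draws two classes with known membership, clusters them once with K-means (hard assignment) and once with a simple LC cluster model (a mixture of normals with class-specific means and diagonal variances, fitted by EM), and reports each method's misclassification rate.

```python
import itertools
import math
import random


def simulate(n_per_class, means, sd, seed=0):
    """Draw n_per_class points per class; variables are independent within a class."""
    rng = random.Random(seed)
    data, labels = [], []
    for k, mu in enumerate(means):
        for _ in range(n_per_class):
            data.append([rng.gauss(m, sd) for m in mu])
            labels.append(k)
    return data, labels


def start_centers(data):
    """Label-free deterministic start: points with the smallest and largest coordinate sum."""
    return [list(min(data, key=sum)), list(max(data, key=sum))]


def nearest(row, centers):
    """Index of the center closest to row in squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda k: sum((x - c) ** 2 for x, c in zip(row, centers[k])))


def kmeans(data, centers, n_iter=50):
    """Lloyd's algorithm: assign each point to its nearest center, then update centers."""
    for _ in range(n_iter):
        assign = [nearest(row, centers) for row in data]
        for k in range(len(centers)):
            members = [row for row, a in zip(data, assign) if a == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(row, centers) for row in data]


def lc_cluster(data, means, n_iter=100):
    """EM for a normal mixture with class-specific means and diagonal variances.

    Returns the posterior class-membership probabilities for every case.
    """
    n, d, n_classes = len(data), len(data[0]), len(means)
    weights = [1.0 / n_classes] * n_classes
    grand = [sum(col) / n for col in zip(*data)]
    pooled = [sum((x - g) ** 2 for x in col) / n for col, g in zip(zip(*data), grand)]
    variances = [pooled[:] for _ in range(n_classes)]
    for _ in range(n_iter):
        post = []
        for row in data:  # E-step: posterior probability of each class
            logp = [math.log(weights[k]) + sum(
                        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                        for x, m, v in zip(row, means[k], variances[k]))
                    for k in range(n_classes)]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            total = sum(unnorm)
            post.append([u / total for u in unnorm])
        for k in range(n_classes):  # M-step: posterior-weighted parameter updates
            nk = max(sum(p[k] for p in post), 1e-12)
            weights[k] = max(nk / n, 1e-12)
            means[k] = [sum(p[k] * row[j] for p, row in zip(post, data)) / nk
                        for j in range(d)]
            variances[k] = [max(sum(p[k] * (row[j] - means[k][j]) ** 2
                                    for p, row in zip(post, data)) / nk, 1e-6)
                            for j in range(d)]
    return post


def error_rate(assign, labels, n_classes=2):
    """Misclassification rate, minimized over relabelings of the clusters."""
    best = min(sum(perm[a] != y for a, y in zip(assign, labels))
               for perm in itertools.permutations(range(n_classes)))
    return best / len(labels)


# Hypothetical setting: two well-separated classes with equal within-class variances.
data, labels = simulate(200, means=[[0.0, 0.0], [4.0, 4.0]], sd=1.0)
km_assign = kmeans(data, start_centers(data))
posteriors = lc_cluster(data, start_centers(data))
lc_assign = [max(range(2), key=p.__getitem__) for p in posteriors]  # modal class
km_err = error_rate(km_assign, labels)
lc_err = error_rate(lc_assign, labels)
print(f"K-means misclassification rate:    {km_err:.3f}")
print(f"LC cluster misclassification rate: {lc_err:.3f}")
```

Unlike K-means, the LC model returns posterior membership probabilities (`posteriors`), so each case carries a measure of classification uncertainty in addition to its modal class.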


A Comparison of Latent Class, K-Means, and K-Median Methods for Clustering Dichotomous Data
Simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data found that the 3 approaches can exhibit profound differences when applied to real data.
Clustering in the field of social sciences: that is your choice
This research considered three data sets: one with solely continuous variables, one with only binary variables, and one with mixed variables. Methodologically, LCM performed reasonably well; in contrast, cluster analysis achieved both the best and the worst performance.
Latent Class Analysis
Goodman’s LC model is referred to as the traditional LC model, which generalizes traditional cluster, factor, and item response theory analyses and also generalizes various kinds of regression modeling where the parameters that differ for different classes are the regression coefficients.
Clique Partitioning for Clustering: A Comparison with K-Means and Latent Class Analysis
This article illustrates the use of a new formulation for the clique partitioning problem that is readily solvable by basic metaheuristic methodologies such as Tabu Search, and enables the widespread use of CP for clustering in practice.
An Animated Guide: An Introduction to Latent Class Clustering in SAS®
This paper explores PROC LCA, a free SAS add-in created by The Methodology Center at Penn State University that allows users of SAS to perform Latent Class Clustering using syntax with which they are already familiar.
Comparing Different Approaches for Clustering Categorical Data
Two approaches, a latent class model (mixture of multinomial distributions) and partitioning around medoids, are evaluated and compared using the Adjusted Rand Index, Average Silhouette Width, and Pearson-Gamma indexes in a fairly wide simulation study.
Identification of Latent Classes in Mixture Models: A Monte Carlo Simulation Study
It can be concluded that under certain design conditions, less parameterized and less complex methods perform just as well as factor mixture modeling in detecting latent classes.
Latent Class Models
Clustering of multivariate binary data with dimension reduction via L1-regularized likelihood maximization
This work presents a novel procedure for simultaneously determining the optimal cluster structure for multivariate binary data and the subspace to represent that cluster structure, based on a finite mixture model of multivariate Bernoulli distributions.
Mixture models for ordinal data: a pairwise likelihood approach
A latent Gaussian mixture model to classify ordinal data is proposed that allows us to overcome the computational problems arising in the full maximum likelihood approach due to the evaluation of multidimensional integrals that cannot be written in closed form.


Latent Class Factor and Cluster Models, Bi-Plots, and Related Graphical Displays
Analyses over several data sets suggest that LC factor models typically fit data better and provide results that are easier to interpret than the corresponding LC cluster models.
Latent class modeling as a probabilistic extension of K-means clustering
According to Kaufman and Rousseeuw (1990), cluster analysis is "the classification of similar objects into groups, where the number of groups, as well as their forms are unknown". This same
How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis
The problems of determining the number of clusters and the clustering method are solved simultaneously by choosing the best model, and the EM result provides a measure of uncertainty about the associated classification of each data point.
Model-based Gaussian and non-Gaussian clustering
The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967), but it is restricted to Gaussian distributions and it does not allow for noise.
LADI: A Latent Discriminant Model for Analyzing Marketing Research Data
A general, flexible LAtent DIscriminant model is described, which accommodates descriptor variables having different scale properties, allows for the investigation of group structure, provides a statistical test of the number of latent clusters to retain, and allows for constraints to be imposed on the solution.
Exploratory latent structure analysis using both identifiable and unidentifiable models
This paper considers a wide class of latent structure models. These models can serve as possible explanations of the observed relationships among a set of m manifest polytomous variables. The
MCLUST: Software for Model-Based Cluster and Discriminant Analysis
MCLUST is a software package for cluster and discriminant analysis written in Fortran and interfaced to the S-PLUS commercial software package and the freely available R language which has a
Bayesian Classification (AutoClass): Theory and Results
It is emphasized that no current unsupervised classification system can produce maximally useful results when operated alone and that it is the interaction between domain experts and the machine searching over the model space that generates new knowledge.
Latent class cluster analysis
An Introduction to Multivariate Statistical Analysis