MCLUST Version 3 for R: Normal Mixture Modeling and Model-Based Clustering
@inproceedings{Fraley2007MCLUSTV3, title={MCLUST Version 3 for R: Normal Mixture Modeling and Model-Based Clustering}, author={Chris Fraley and Adrian E. Raftery}, year={2007} }
MCLUST is a contributed R package for normal mixture modeling and model-based clustering. It provides functions for parameter estimation via the EM algorithm for normal mixture models with a variety of covariance structures, and functions for simulation from these models. Also included are functions that combine model-based hierarchical clustering, EM for mixture estimation and the Bayesian Information Criterion (BIC) in comprehensive strategies for clustering, density estimation and…
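The strategy described in the abstract (fit normal mixtures by EM, then compare candidate models with BIC) can be illustrated with a minimal sketch. This is not MCLUST code and does not use its API: it is a pure-Python, one-dimensional toy with unequal component variances, and the names `fit_mixture`, `normal_pdf`, `log_likelihood` and `results` are hypothetical. It assumes the sign convention BIC = 2 log L - (number of parameters) log n, so larger values are better, and it initializes from equal-sized chunks of the sorted data rather than from model-based hierarchical clustering.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    total = 0.0
    for x in data:
        dens = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(dens or 1e-300)  # guard against underflow to zero
    return total


def fit_mixture(data, k, n_iter=200, var_floor=1e-3):
    """EM for a 1-D normal mixture with k components and unequal variances."""
    n = len(data)
    xs = sorted(data)
    # Assumed initialization: equal-sized chunks of the sorted data.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    variances = [max(overall_var, var_floor)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M step: weighted updates of proportions, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)  # floor avoids degenerate components

    loglik = log_likelihood(data, weights, means, variances)
    n_params = 3 * k - 1  # (k - 1) proportions + k means + k variances
    bic = 2.0 * loglik - n_params * math.log(n)
    return {"weights": weights, "means": means, "variances": variances,
            "loglik": loglik, "bic": bic}


# Synthetic data: two well-separated normal clusters (illustrative only).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(6.0, 1.0) for _ in range(100)]

# Fit candidate models and keep the one with the largest BIC.
results = {k: fit_mixture(data, k) for k in (1, 2, 3)}
best_k = max(results, key=lambda k: results[k]["bic"])
print("selected number of components:", best_k)
```

The variance floor is a crude stand-in for the regularization that a real implementation would need; without some safeguard, EM can drive a component's variance toward zero and the likelihood toward infinity.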
484 Citations
Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering
- Mathematics, Computer Science · J. Classif.
- 2007
A modified version of BIC is proposed, where the likelihood is evaluated at the MAP instead of the MLE, and the resulting method avoids degeneracies and singularities, but when these are not present it gives similar results to the standard method using MLE.
Mixture model averaging for clustering
- Computer Science · Adv. Data Anal. Classif.
- 2015
This work averages multiple models that are in some sense close to the best one, thereby producing a weighted average of clustering results, and introduces a method for merging mixture components based on the adjusted Rand index.
Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution
- Mathematics · Stat. Comput.
- 2012
A new class of distributions, multivariate t distributions with the Box-Cox transformation, is proposed for mixture modeling, which provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues.
Using conditional independence for parsimonious model-based Gaussian clustering
- Computer Science · Stat. Comput.
- 2013
Novel Gaussian parsimonious clustering models are proposed in which constraints are placed on the component-specific variance matrices. They are obtained by assuming that the variables can be partitioned into groups that are conditionally independent within components, which yields component-specific variance matrices with a block-diagonal structure.
Genetic Algorithms for Subset Selection in Model-Based Clustering
- Computer Science
- 2016
Subset selection is recast as a model comparison problem in which BIC is used to approximate Bayes factors. The proposed criterion is the BIC difference between a candidate clustering model for the given subset and a model that assumes no clustering for the same subset.
Mixtures of modified t-factor analyzers for model-based clustering, classification, and discriminant
- Computer Science
- 2011
Cluster Analysis, Model Selection, and Prior Distributions on Models
- Economics
- 2014
A product partition model and a model selection procedure based on Bayes factors from intrinsic priors are developed, and it is found that a new prior, the hierarchical uniform prior, leads to consistent model selection procedures and has other desirable properties.
Model-based clustering, classification, and discriminant analysis via mixtures of multivariate t-distributions
- Computer Science · Stat. Comput.
- 2012
A novel family of mixture models wherein each component is modeled using a multivariate t-distribution with an eigen-decomposed covariance structure is put forth, known as the tEIGEN family.
Model‐based clustering of longitudinal data
- Computer Science
- 2010
A new family of mixture models for the model‐based clustering of longitudinal data is introduced, the covariance structures of eight members are given, and the associated maximum likelihood estimates for the parameters are derived via expectation–maximization (EM) algorithms.
References
SHOWING 1-10 OF 38 REFERENCES
Enhanced Model-Based Clustering, Density Estimation, and Discriminant Analysis Software: MCLUST
- Computer Science · J. Classif.
- 2003
MCLUST is a software package for model-based clustering, density estimation and discriminant analysis interfaced to the S-PLUS commercial software and the R language. It implements parameterized…
Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering
- Mathematics, Computer Science · J. Classif.
- 2007
A modified version of BIC is proposed, where the likelihood is evaluated at the MAP instead of the MLE, and the resulting method avoids degeneracies and singularities, but when these are not present it gives similar results to the standard method using MLE.
Model-based Gaussian and non-Gaussian clustering
- Computer Science
- 1993
The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967), but it is restricted to Gaussian distributions and it does not allow for noise.
Model-Based Clustering, Discriminant Analysis, and Density Estimation
- Computer Science
- 2002
This work reviews a general methodology for model-based clustering that provides a principled statistical approach to important practical questions that arise in cluster analysis, such as how many clusters are there, which clustering method should be used, and how should outliers be handled.
How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis
- Computer Science · Comput. J.
- 1998
The problems of determining the number of clusters and the clustering method are solved simultaneously by choosing the best model, and the EM result provides a measure of uncertainty about the associated classification of each data point.
Gaussian parsimonious clustering models
- Computer Science · Pattern Recognit.
- 1995
Model-based Methods of Classification: Using the mclust Software in Chemometrics
- Computer Science
- 2007
Due to recent advances in methods and software for model-based clustering, and to the interpretability of the results, clustering procedures based on probability models are increasingly preferred…
Incremental Model-Based Clustering for Large Datasets With Small Clusters
- Computer Science
- 2005
An incremental approach for data that can be processed as a whole in memory is proposed, which is relatively efficient computationally and has the ability to find small clusters in large datasets.
Model-Based Clustering for Image Segmentation and Large Datasets via Sampling
- Computer Science · J. Classif.
- 2004
These experiments suggest that a stable method with better performance can be obtained with two straightforward modifications to the simple sampling method: several tentative models are identified from the sample instead of just one, and several EM steps are used rather than just one E step to classify the full data set.
Detecting features in spatial point processes with clutter via model-based clustering
- Mathematics
- 1998
We consider the problem of detecting features, such as minefields or seismic faults, in spatial point processes when there is substantial clutter. We use model-based clustering based on a…