From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering

  • Sylvia Frühwirth-Schnatter, Gertraud Malsiner-Walli
  • Advances in Data Analysis and Classification, pages 33–64
  • 2019
In model-based clustering, mixture models are used to group data points into clusters. A useful concept, introduced for Gaussian mixtures by Malsiner Walli et al. (Stat Comput 26:303–324, 2016), is the sparse finite mixture, where the prior on the weight distribution of a mixture with K components is chosen such that, a priori, the number of clusters in the data is random and is allowed to be smaller than K with high probability. The number of clusters is then inferred a posteriori from the data.
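
The sparse-weight idea can be seen directly by simulation: with a symmetric Dirichlet prior on the weights whose concentration parameter is small, most of the K components stay empty a priori. A minimal sketch (parameter values `K`, `e0`, `n` are illustrative choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

K = 10       # components in the overfitted mixture
e0 = 0.01    # small Dirichlet concentration -> sparse weights a priori
n = 100      # observations
draws = 2000 # prior simulations

filled = []
for _ in range(draws):
    w = rng.dirichlet(np.full(K, e0))   # sparse weight vector
    z = rng.choice(K, size=n, p=w)      # component allocations
    filled.append(len(np.unique(z)))    # number of non-empty clusters K+

filled = np.array(filled)
print("mean number of filled clusters:", filled.mean())
print("P(K+ < K):", (filled < K).mean())
```

With a concentration this small, the prior mass concentrates on partitions that use far fewer than K components, which is exactly what lets the number of clusters be inferred rather than fixed.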

A Bayesian sparse finite mixture model for clustering data from a heterogeneous population

A Bayesian approach for clustering data uses a sparse finite mixture model (SFMM); a split-merge strategy is inserted within the algorithm in order to improve the mixing of the Markov chain with respect to the number of clusters.

Infinite Mixtures of Multivariate Normal-Inverse Gaussian Distributions for Clustering of Skewed Data

An infinite mixture model framework, also known as Dirichlet process mixture model, is proposed for the mixtures of MNIG distributions and the number of components is inferred along with the parameter estimates in a Bayesian framework thus alleviating the need for model selection criteria.
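
The partition prior implied by a Dirichlet process mixture can be drawn via the Chinese restaurant process, which makes concrete how the number of components is random and inferred rather than fixed. A minimal sketch (function name and parameter values are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def crp_partition(n, alpha, rng):
    """Draw a partition of n items from the Chinese restaurant process
    with concentration alpha (the partition prior implied by a DP)."""
    counts = []  # current cluster sizes
    for _ in range(n):
        # item joins cluster k w.p. proportional to its size,
        # or opens a new cluster w.p. proportional to alpha
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)  # new cluster
        else:
            counts[k] += 1
    return counts

counts = crp_partition(200, alpha=1.0, rng=rng)
print("number of clusters:", len(counts))
```

The number of occupied clusters grows only logarithmically in n under this prior, so the model adapts its effective complexity to the data.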

Clustering multivariate data using factor analytic Bayesian mixtures with an unknown number of components

This work considers a set of eight parameterizations, giving rise to parsimonious representations of the covariance matrix per cluster, which are compared to similar models estimated using the expectation–maximization algorithm on simulated and real datasets.

Generalized Mixtures of Finite Mixtures and Telescoping Sampling

The novel telescoping sampler is proposed, which allows Bayesian inference for mixtures with arbitrary component distributions without the need to resort to RJMCMC methods; it is demonstrated on several data sets.

Infinite Mixtures of Infinite Factor Analysers

The IMIFA model obviates the need for model selection criteria, reduces the computational burden associated with the search of the model space, improves clustering performance by allowing cluster-specific numbers of factors, and quantifies uncertainty in the numbers of clusters and cluster-specific factors.

Gibbs sampling for mixtures in order of appearance: the ordered allocation sampler

This work derives a sampler that is straightforward to implement for mixing distributions with tractable size-biased ordered weights and mitigates the label-switching problem in infinite mixtures.

Variance matrix priors for Dirichlet process mixture models with Gaussian kernels

The results show that the choice of prior is critical for reliable posterior inference in problems of higher dimensionality; beyond clustering, the DPMM is also applicable to density estimation.


A new class of priors is introduced, the Normalized Independent Point Process; inference relies on an auxiliary-variable MCMC scheme that handles the otherwise intractable posterior distribution and overcomes the challenges associated with the Reversible Jump algorithm.

Escaping the curse of dimensionality in Bayesian model based clustering

A Bayesian oracle for clustering is defined, with the oracle clustering posterior based on the true values of low-dimensional latent variables, together with a class of LAtent Mixtures for Bayesian (Lamb) clustering models whose behavior is equivalent to that of the oracle as dimension grows.

Dynamic mixtures of finite mixtures and telescoping sampling

A novel sampling scheme for MFMs, called the telescoping sampler, allows Bayesian inference for mixtures with arbitrary component distributions; the ease of its application with different component distributions is demonstrated on real data sets.

Identifying Mixtures of Mixtures Using Bayesian Estimation

  • G. Malsiner-Walli, S. Frühwirth-Schnatter, B. Grün
  • Journal of Computational and Graphical Statistics
  • 2017
This work proposes a different approach based on sparse finite mixtures to achieve identifiability within the Bayesian framework, where the hyperparameters are carefully selected such that they are reflective of the cluster structure aimed at.

A simple example of Dirichlet process mixture inconsistency for the number of components

An elementary proof of this inconsistency is given in what is perhaps the simplest possible setting: a DPM with normal components of unit variance, applied to data from a "mixture" with one standard normal component.

Mixture Models With a Prior on the Number of Components

It turns out that many of the essential properties of DPMs are also exhibited by MFMs, and the MFM analogues are simple enough that they can be used much like the corresponding DPM properties; this simplifies the implementation of MFMs and can substantially improve mixing.
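
An MFM places an explicit prior on the number of components K and a symmetric Dirichlet prior on the weights given K, in contrast to the DPM's implicit infinite-dimensional prior. A minimal generative sketch of this prior (the name `sample_mfm` and the values of `lam` and `gamma` are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_mfm(n, lam, gamma, rng):
    """One generative draw from an MFM prior:
    K ~ 1 + Poisson(lam), w | K ~ Dirichlet(gamma, ..., gamma),
    then component allocations for n observations."""
    K = 1 + rng.poisson(lam)                 # prior on number of components
    w = rng.dirichlet(np.full(K, gamma))     # symmetric Dirichlet weights
    z = rng.choice(K, size=n, p=w)           # allocations
    k_plus = len(np.unique(z))               # number of *filled* components
    return K, k_plus

K, k_plus = sample_mfm(500, lam=3.0, gamma=1.0, rng=rng)
print("components K:", K, "filled components K+:", k_plus)
```

The distinction between K (components in the model) and K+ (components actually filled by data) is what makes the DPM-style exchangeable-partition machinery carry over to MFMs.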

Model-based Gaussian and non-Gaussian clustering

The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967), but it is restricted to Gaussian distributions and it does not allow for noise.

Finite Mixture and Markov Switching Models

This book should help newcomers to the field to understand how finite mixture and Markov switching models are formulated, what structures they imply on the data, what they could be used for, and how they are estimated.

Slice sampling mixture models

A more efficient version of the slice sampler for Dirichlet process mixture models described by Walker allows for the fitting of infinite mixture models with a wide range of prior specifications, and considers priors defined through infinite sequences of independent positive random variables.
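
The slice trick rests on a simple fact: augmenting each observation with a uniform slice variable u truncates the infinite stick-breaking mixture to the finitely many components whose weight exceeds u. A rough sketch of the mechanism (function name, `alpha`, and the fixed slice value `u` are illustrative assumptions, not the sampler itself):

```python
import numpy as np

rng = np.random.default_rng(3)
alpha = 2.0  # DP concentration parameter

def stick_breaking(alpha, rng, tol=1e-8):
    """Stick-breaking weights of a DP: w_k = v_k * prod_{j<k}(1 - v_j),
    with v_k ~ Beta(1, alpha); truncate once the remaining stick is tiny."""
    w, remaining = [], 1.0
    while remaining > tol:
        v = rng.beta(1.0, alpha)
        w.append(v * remaining)
        remaining *= 1.0 - v
    return np.array(w)

w = stick_breaking(alpha, rng)

# Slice idea: given u ~ Uniform(0, w_z), only components with w_k > u can
# receive the allocation, so the infinite sum becomes a finite one.
u = 0.01
active = np.flatnonzero(w > u)
print("active components:", active.size, "of", w.size, "generated")
```

Because the weights sum to one, at most 1/u of them can exceed u, which is why the augmented sampler only ever touches finitely many components per sweep.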

Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data

An improved method of auxiliary mixture sampling that uses a bounded number of latent variables per observation leads to a substantial increase in the efficiency of auxiliary mixture sampling for highly structured models.

Finite Mixture Models

The aim of this article is to provide an up-to-date account of the theory and methodological developments underlying the applications of finite mixture models.

Model-based clustering and classification with non-normal mixture distributions

This paper considers some of these existing proposals of multivariate non-normal mixture models and compares the relative performance of restricted and unrestricted skew mixture models in clustering, discriminant analysis, and density estimation on six real datasets from flow cytometry, finance, and image analysis.

Inference in model-based cluster analysis

This work proposes a new approach to cluster analysis consisting of exact Bayesian inference via Gibbs sampling and the calculation of Bayes factors from the output using the Laplace–Metropolis estimator; the approach works well in several real and simulated examples.