Corpus ID: 221954280

Finite mixture models do not reliably learn the number of components

@inproceedings{Cai2021FiniteMM,
  title={Finite mixture models do not reliably learn the number of components},
  author={Diana Cai and Trevor Campbell and Tamara Broderick},
  booktitle={ICML},
  year={2021}
}
Scientists and engineers are often interested in learning the number of subpopulations (or components) present in a data set. A common suggestion is to use a finite mixture model (FMM) with a prior on the number of components. Past work has shown the resulting FMM component-count posterior is consistent; that is, the posterior concentrates on the true generating number of components. But existing results crucially depend on the assumption that the component likelihoods are perfectly specified…
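The abstract's claim can be illustrated with a small numerical experiment. Below is a minimal sketch, not the paper's analysis: it generates data from a two-component mixture whose components are heavy-tailed (Student-t), then fits Gaussian mixtures with increasing K and compares them by BIC as a rough stand-in for a Bayesian component-count posterior. Because the Gaussian component likelihood is misspecified for heavy-tailed data, the selected K typically exceeds the true value of 2. The package choices (numpy, scipy, scikit-learn) and the use of BIC are assumptions made for illustration only.

```python
# Illustrative sketch (not the paper's method): a misspecified component
# likelihood can inflate the estimated number of mixture components.
# Assumes numpy, scipy, and scikit-learn are installed; BIC model selection
# stands in for the Bayesian component-count posterior discussed in the paper.
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# True data: a 2-component mixture with heavy-tailed Student-t components,
# so a Gaussian component likelihood is misspecified.
n = 2000
labels = rng.random(n) < 0.5
x = np.where(
    labels,
    stats.t.rvs(df=3, loc=-2.0, scale=1.0, size=n, random_state=rng),
    stats.t.rvs(df=3, loc=+2.0, scale=1.0, size=n, random_state=rng),
)
X = x.reshape(-1, 1)

# Fit Gaussian mixtures with K = 1..8 components and compare BIC (lower is better).
for k in range(1, 9):
    gmm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
    print(f"K = {k}: BIC = {gmm.bic(X):.1f}")

# With heavy-tailed data, the preferred K is usually larger than 2: extra
# Gaussian components are recruited to mimic the tails, echoing the paper's
# point that component-count estimates are unreliable under misspecification.
```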

References

Showing 1-10 of 87 references
Convergence of latent mixing measures in finite and infinite mixture models
This paper studies convergence behavior of latent mixing measures that arise in finite and infinite mixture models, using transportation distances (i.e., Wasserstein metrics). The relationship …
Finite Mixture and Markov Switching Models
This book should help newcomers to the field understand how finite mixture and Markov switching models are formulated, what structures they imply on the data, what they can be used for, and how they are estimated.
On the Identifiability of Finite Mixtures
On posterior contraction of parameters and interpretability in Bayesian mixture modeling
We study posterior contraction behaviors for parameters of interest in the context of Bayesian mixture modeling, where the number of mixing components is unknown while the model itself may or may not …
Strong identifiability and optimal minimax rates for finite mixture estimation
Dirichlet Process Mixture Model for Correcting Technical Variation in Single-Cell Gene Expression Data
The model is formulated as a hierarchical Bayesian mixture model with cell-specific scalings that aid the iterative normalization and clustering of cells, teasing apart technical variation from biological signals; identifiability and weak convergence guarantees of the method are shown.
On strong identifiability and convergence rates of parameter estimation in finite mixtures
This paper studies identifiability and convergence behaviors for parameters of multiple types, including matrix-variate ones, that arise in finite mixtures, and the effects of model fitting …
A simple example of Dirichlet process mixture inconsistency for the number of components
An elementary proof of this inconsistency is given in what is perhaps the simplest possible setting: a DPM with normal components of unit variance, applied to data from a "mixture" with one standard normal component.
Probability and Measure
Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions
We consider the problem of estimating the number of components d and the unknown mixing distribution in a finite mixture model, in which d is bounded by some fixed finite number N. Our approach …