# Likelihood Landscape and Local Minima Structures of Gaussian Mixture Models

    @article{Chen2020LikelihoodLA,
      title   = {Likelihood Landscape and Local Minima Structures of Gaussian Mixture Models},
      author  = {Yudong Chen and Xumei Xi},
      journal = {ArXiv},
      year    = {2020},
      volume  = {abs/2009.13040}
    }

In this paper, we study the landscape of the population negative log-likelihood function of Gaussian Mixture Models with a general number of components. Due to nonconvexity, there exist multiple local minima that are not globally optimal, even when the mixture is well-separated. We show that all local minima share the same form of structure that partially identifies the component centers of the true mixture, in the sense that each local minimum involves a non-overlapping combination of fitting…
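As a concrete illustration (not code from the paper), here is a minimal NumPy sketch of the negative log-likelihood for an equal-weight, spherical Gaussian mixture. The helper `gmm_neg_log_likelihood` and the specific centers are assumptions for this sketch; the point is that a configuration of the many-fit-one / one-fit-many type described above, while locally stable, has strictly higher negative log-likelihood than the true centers:

```python
import numpy as np

def gmm_neg_log_likelihood(X, centers, sigma=1.0):
    """Negative log-likelihood of X under an equal-weight,
    spherical Gaussian mixture with the given centers."""
    # X: (n, d), centers: (k, d)
    diffs = X[:, None, :] - centers[None, :, :]            # (n, k, d)
    sq = np.sum(diffs ** 2, axis=-1)                       # (n, k)
    d = X.shape[1]
    log_comp = -0.5 * sq / sigma**2 - 0.5 * d * np.log(2 * np.pi * sigma**2)
    # log of the average of component densities (equal weights 1/k)
    log_mix = np.logaddexp.reduce(log_comp, axis=1) - np.log(centers.shape[0])
    return -np.mean(log_mix)

# Well-separated 1-D mixture with three true centers.
rng = np.random.default_rng(0)
true_centers = np.array([[-10.0], [0.0], [10.0]])
X = np.concatenate([c + rng.standard_normal((500, 1)) for c in true_centers])

# A fit that duplicates one true center and uses a single component
# to cover two others -- the structure the abstract describes.
spurious = np.array([[-10.0], [-10.0], [5.0]])

print(gmm_neg_log_likelihood(X, true_centers))
print(gmm_neg_log_likelihood(X, spurious))
```

The true centers achieve the lower (better) value; the spurious configuration pays a large likelihood penalty for the unmatched clusters, even though gradient-based fitting can still get stuck near it.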

## 2 Citations

### Estimating Gaussian mixtures using sparse polynomial moment systems

- Computer Science, Mathematics
- 2021

This work presents an algorithm that performs parameter recovery, and therefore density estimation, for high dimensional Gaussian mixture models that scales linearly in the dimension.

### A Geometric Approach to $k$-means

- Computer Science
- 2022

This work proposes a general algorithmic framework for escaping undesirable local solutions and recovering the global solution (or the ground truth) of k-means clustering by iteratively alternating between two steps.

## References

Showing 1–10 of 34 references

### Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences

- Computer Science, Mathematics
- NIPS
- 2016

It is established that a first-order variant of EM will not converge to strict saddle points almost surely, indicating that the poor performance of the first-order method can be attributed to the existence of bad local maxima rather than bad saddle points.

### Are There Local Maxima in the Infinite-Sample Likelihood of Gaussian Mixture Estimation?

- Computer Science, Mathematics
- COLT
- 2007

Consider the problem of estimating the centers of a uniform mixture of unit-variance spherical Gaussians…

### Convergence of Gradient EM on Multi-component Mixture of Gaussians

- Computer Science
- NIPS
- 2017

The convergence properties of the gradient variant of Expectation-Maximization algorithm for Gaussian Mixture Models for arbitrary number of clusters and mixing coefficients are studied and a near-optimal local contraction radius is obtained.

### Strong identifiability and optimal minimax rates for finite mixture estimation

- Economics
- The Annals of Statistics
- 2018

We study the rates of estimation of finite mixing distributions, that is, the parameters of the mixture. We prove that under some regularity and strong identifiability conditions, around a given mixing…

### Challenges with EM in application to weakly identifiable mixture models

- Mathematics
- ArXiv
- 2019

This work demonstrates, via simulation studies, a broad range of over-specified mixture models for which the EM algorithm converges very slowly, both in one and higher dimensions, and reveals distinct regimes in the convergence behavior of EM as a function of the dimension $d$.

### Ten Steps of EM Suffice for Mixtures of Two Gaussians

- Computer Science, Mathematics
- COLT
- 2017

This work shows that the population version of EM, where the algorithm is given access to infinitely many samples from the mixture, converges geometrically to the correct mean vectors, and provides simple, closed-form expressions for the convergence rate.
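As a hedged sketch of this style of result: for the symmetric, equal-weight two-component model (means $\pm\theta$, unit variance), the sample EM iteration reduces to the well-known closed-form update $\theta \leftarrow \frac{1}{n}\sum_i x_i \tanh(\theta x_i)$. The setup below (separation, sample size, initialization) is illustrative, not the paper's exact setting:

```python
import numpy as np

rng = np.random.default_rng(1)
theta_star = 3.0
n = 20_000

# Draw from 0.5 * N(theta_star, 1) + 0.5 * N(-theta_star, 1).
signs = rng.choice([-1.0, 1.0], size=n)
x = signs * theta_star + rng.standard_normal(n)

# Closed-form sample EM update for this symmetric two-Gaussian model.
theta = 0.5  # initialized on the correct side of zero
for _ in range(10):
    theta = np.mean(x * np.tanh(theta * x))

print(theta)  # close to theta_star after a handful of iterations
```

With a well-separated mixture and a sign-correct initialization, the iterates contract geometrically toward the true mean, matching the flavor of the convergence result summarized above.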

### Benefits of over-parameterization with EM

- Computer Science
- NeurIPS
- 2018

It is proved that introducing the (statistically redundant) weight parameters enables EM to find the global maximizer of the log-likelihood starting from almost any initial mean parameters, whereas EM without this over-parameterization may very often fail.

### Singularity, misspecification and the convergence rate of EM

- Computer Science, Mathematics
- The Annals of Statistics
- 2020

This work makes use of a careful form of localization in the associated empirical process, and develops a recursive argument to progressively sharpen the statistical rate of the EM algorithm in over-specified settings.

### Global Convergence of EM Algorithm for Mixtures of Two Component Linear Regression

- Computer Science, Mathematics
- COLT
- 2019

It is shown here that EM converges for mixed linear regression with two components (it is known that it may fail to converge for three or more), and moreover that this convergence holds for random initialization.

### Global Convergence of Least Squares EM for Demixing Two Log-Concave Densities

- Mathematics
- NeurIPS
- 2019

It is demonstrated that Least Squares EM, a variant of the EM algorithm, converges to the true location parameter from a randomly initialized point, and this global convergence property is robust under model mis-specification.