# Fast Incremental Expectation Maximization for finite-sum optimization: nonasymptotic convergence

@article{Fort2020FastIE, title={Fast Incremental Expectation Maximization for finite-sum optimization: nonasymptotic convergence}, author={Gersende Fort and P. Gach and {\'E}ric Moulines}, journal={Stat. Comput.}, year={2020}, volume={31}, pages={48} }

Fast Incremental Expectation Maximization (FIEM) is a version of the EM framework for large datasets. In this paper, we first recast FIEM and other incremental EM-type algorithms in the Stochastic Approximation within EM framework. Then, we provide nonasymptotic bounds for the convergence in expectation as a function of the number of examples n and of the maximal number of iterations Kmax. We propose two strategies for achieving an ε-approximate stationary point, respectively with Kmax = O(n…
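As a rough illustration of the Stochastic Approximation within EM idea mentioned in the abstract, the sketch below runs a plain stochastic-approximation update on the averaged sufficient statistics of a toy model. It is not FIEM itself (there is no variance-reduction or control-variate term). The model (a two-component, unit-variance Gaussian mixture in one dimension), the constant step size, and the names `sa_em`, `e_step_one`, and `m_step` are all assumptions made for this example and do not come from the paper.

```python
import numpy as np


def m_step(s0, s1):
    """Map averaged sufficient statistics to parameters (weights, means)."""
    weights = s0 / s0.sum()
    means = s1 / s0
    return weights, means


def e_step_one(y, weights, means):
    """Conditional expectation of the complete-data statistics for one example.

    Assumes unit-variance components, so only weights and means matter.
    """
    log_r = np.log(weights) - 0.5 * (y - means) ** 2
    r = np.exp(log_r - log_r.max())  # subtract the max for numerical stability
    r /= r.sum()
    return r, r * y


def sa_em(data, k_max, step=0.05, seed=0):
    """Stochastic approximation on the statistics, one random example per step."""
    rng = np.random.default_rng(seed)
    # Crude initial guess: equal weights, means at the extremes of the data.
    weights = np.array([0.5, 0.5])
    means = np.array([data.min(), data.max()], dtype=float)
    s0, s1 = weights.copy(), weights * means
    for _ in range(k_max):
        i = rng.integers(len(data))            # draw one example uniformly
        weights, means = m_step(s0, s1)        # current parameters
        r0, r1 = e_step_one(data[i], weights, means)
        s0 += step * (r0 - s0)                 # move statistics toward the
        s1 += step * (r1 - s1)                 # per-example expectation
    return m_step(s0, s1)
```

Each iteration touches a single example, so the cost per iteration does not grow with n. The constant step size is a simplification chosen for this sketch.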

## 5 Citations

### Geom-Spider-EM: Faster Variance Reduced Stochastic Expectation Maximization for Nonconvex Finite-Sum Optimization

- Computer Science
- 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2021

This paper proposes an extension of the Stochastic Path-Integrated Differential EstimatoR EM (SPIDER-EM) and derives complexity bounds for this novel algorithm, designed to solve smooth nonconvex finite-sum optimization problems.

### An online Minorization-Maximization algorithm

- Computer Science
- 2022

It is shown that an online version of the Minorization–Maximization (MM) algorithm, which includes the online EM algorithm as a special case, can be constructed in a similar manner.

### Federated Expectation Maximization with heterogeneity mitigation and variance reduction

- Computer Science
- ArXiv
- 2021

FedEM is a new communication method that handles partial participation of local devices and is robust to heterogeneous distributions of the datasets. The paper also develops and analyzes an extension of FedEM that incorporates a variance reduction scheme.

### The Perturbed Prox-Preconditioned Spider Algorithm for EM-Based Large Scale Learning

- Computer Science
- 2021 IEEE Statistical Signal Processing Workshop (SSP)
- 2021

The 3P-SPIDER algorithm addresses many intractabilities of the E-step of EM; it also handles non-smooth regularization and a convex constraint set. The paper discusses the role of some design parameters.

### The Perturbed Prox-Preconditioned Spider Algorithm: Non-Asymptotic Convergence Bounds

- Computer Science, Mathematics
- 2021 IEEE Statistical Signal Processing Workshop (SSP)
- 2021

A novel algorithm named Perturbed Prox-Preconditioned SPIDER (3P-SPIDER) is introduced. It is a stochastic variance-reduced proximal-gradient type algorithm built on Stochastic Path Integral…

## References

### On the Global Convergence of (Fast) Incremental Expectation Maximization Methods

- Computer Science
- NeurIPS
- 2019

This paper analyzes incremental and stochastic versions of the EM algorithm, as well as the variance-reduced version of [Chen et al., 2018], in a common unifying framework and establishes non-asymptotic convergence bounds for global convergence.

### A Lower Bound for the Optimization of Finite Sums

- Computer Science
- ICML
- 2015

A lower bound for optimizing a finite sum of n functions, where each function is L-smooth and the sum is µ-strongly convex, is presented, and upper bounds for recently developed methods specializing to this setting are compared.

### Stochastic Expectation Maximization with Variance Reduction

- Computer Science
- NeurIPS
- 2018

It is shown that sEM-vr has the same exponential asymptotic convergence rate as batch EM, and only requires a constant step size to achieve this rate, which alleviates the burden of parameter tuning.

### Minimizing finite sums with the stochastic average gradient

- Computer Science
- Math. Program.
- 2017

Numerical experiments indicate that the new SAG method often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.

### Mini-batch learning of exponential family finite mixture models

- Computer Science
- Stat. Comput.
- 2020

It is demonstrated that the mini-batch algorithm for mixtures of normal distributions can outperform the standard EM algorithm, and a scheme for the stochastic stabilization of the constructed mini-batch algorithms is proposed.

### Convergence Theorems for Generalized Alternating Minimization Procedures

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2005

This work studies EM variants in which the E-step is not performed exactly, either to obtain improved rates of convergence, or due to approximations needed to compute statistics under a model family over which E-steps cannot be realized.

### Non-asymptotic Analysis of Biased Stochastic Approximation Scheme

- Computer Science
- COLT
- 2019

This work analyzes a general SA scheme to minimize a non-convex, smooth objective function, and illustrates these settings with the online EM algorithm and the policy-gradient method for average reward maximization in reinforcement learning.

### Convergence of the Monte Carlo expectation maximization for curved exponential families

- Mathematics
- 2003

The Monte Carlo expectation maximization (MCEM) algorithm is a versatile tool for inference in incomplete data models, especially when used in combination with Markov chain Monte Carlo simulation…

### On the choice of the number of blocks with the incremental EM algorithm for the fitting of normal mixtures

- Computer Science
- Stat. Comput.
- 2003

A simple rule is proposed for choosing the number of blocks with the IEM algorithm in the extreme case of one observation per block, which provides efficient updating formulas that avoid the direct calculation of the inverses and determinants of the component-covariance matrices.

### Generalized Majorization-Minimization

- Computer Science
- ICML
- 2019

This work derives G-MM algorithms for several latent variable models and shows empirically that they consistently outperform their MM counterparts in optimizing non-convex objectives and appear to be less sensitive to initialization.