# Sinkhorn EM: An Expectation-Maximization algorithm based on entropic optimal transport

@article{Mena2020SinkhornEA, title={Sinkhorn EM: An Expectation-Maximization algorithm based on entropic optimal transport}, author={Gonzalo E. Mena and Amin Nejatbakhsh and E. Varol and Jonathan Niles-Weed}, journal={ArXiv}, year={2020}, volume={abs/2006.16548} }

We study Sinkhorn EM (sEM), a variant of the expectation maximization (EM) algorithm for mixtures based on entropic optimal transport. sEM differs from the classic EM algorithm in the way responsibilities are computed during the expectation step: rather than assign data points to clusters independently, sEM uses optimal transport to compute responsibilities by incorporating prior information about mixing weights. Like EM, sEM has a natural interpretation as a coordinate ascent procedure, which…
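
The difference between the two expectation steps can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes the entropic optimal transport problem couples the uniform empirical distribution over data points with the mixing weights, uses the negative component log-likelihood as cost, and fixes the regularization strength at one. The function names, the iteration count, and the toy data are hypothetical choices made for illustration.

```python
import numpy as np


def em_responsibilities(log_lik, weights):
    """Classic EM E-step: each data point (row) is normalized independently."""
    log_post = log_lik + np.log(weights)
    log_post = log_post - log_post.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_post)
    return resp / resp.sum(axis=1, keepdims=True)


def sinkhorn_responsibilities(log_lik, weights, n_iters=500):
    """Sketch of an sEM-style E-step via Sinkhorn matrix scaling (assumed setup).

    Finds a coupling with row sums 1/n and column sums equal to the mixing
    weights, then rescales rows so they read as responsibilities.
    """
    n, k = log_lik.shape
    # Subtracting the row maximum only rescales rows, which u absorbs.
    kernel = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
    row_marginal = np.full(n, 1.0 / n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = row_marginal / (kernel @ v)   # match the data-point marginal
        v = weights / (kernel.T @ u)      # match the mixing-weight marginal
    u = row_marginal / (kernel @ v)       # final row update: rows sum exactly to 1/n
    coupling = u[:, None] * kernel * v[None, :]
    return coupling * n                   # each row now sums to 1


# Toy data: two unit-variance Gaussians in 1D (illustrative values only).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1.0, 1.0, 60), rng.normal(1.0, 1.0, 140)])
means = np.array([-1.0, 1.0])
weights = np.array([0.3, 0.7])
# Log-likelihood up to an additive constant shared by both components.
log_lik = -0.5 * (x[:, None] - means[None, :]) ** 2

r_em = em_responsibilities(log_lik, weights)
r_sem = sinkhorn_responsibilities(log_lik, weights)
```

In this sketch, the column averages of `r_sem` match `weights` up to the Sinkhorn tolerance, whereas the column averages of `r_em` are not constrained to do so.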

## 5 Citations

### A Wasserstein Minimax Framework for Mixed Linear Regression

- Computer Science
- ICML
- 2021

An optimal transport-based framework for MLR problems, Wasserstein Mixed Linear Regression (WMLR), is proposed, which minimizes the Wasserstein distance between the learned and target mixture regression models.

### LiMIIRL: Lightweight Multiple-Intent Inverse Reinforcement Learning

- Computer Science
- ArXiv
- 2021

This work presents a warm-start strategy based on up-front clustering of the demonstrations in feature space that produces a near-optimal reward ensemble, and proposes an MI-IRL performance metric that generalizes the popular Expected Value Difference measure to directly assess learned rewards against the ground-truth reward ensemble.

### Probabilistic Joint Segmentation and Labeling of C. elegans Neurons

- Computer Science
- MICCAI
- 2020

A variation of the EM algorithm called Sinkhorn-EM (sEM) is proposed that uses regularized optimal transport Sinkhorn iterations to enforce constraints on the marginals of the joint distribution of observed variables and latent assignments, in order to incorporate prior information about cell sizes into the cluster-data assignment proportions.

### Toward a more accurate 3D atlas of C. elegans neurons

- Biology
- bioRxiv
- 2021

The most complete full-body C. elegans 3D positional neuron atlas, incorporating positional variability derived from at least 7 animals per neuron, is released for the purposes of cell-type identity prediction for myriad applications.

### GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models

- Computer Science
- ArXiv
- 2022

GMMSeg is a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class) and outperform their discriminative counterparts on three closed-set datasets.

## References


### Sinkhorn Distances: Lightspeed Computation of Optimal Transport

- Computer Science
- NIPS
- 2013

This work smooths the classic optimal transport problem with an entropic regularization term, and shows that the resulting optimum is also a distance which can be computed through Sinkhorn's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transport solvers.
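
For orientation, the regularized problem and the matrix scaling iteration summarized above can be written as follows. The notation is assumed here for illustration: $r$ and $c$ are the two marginals, $M$ is the cost matrix, $U(r, c)$ is the set of nonnegative matrices with row sums $r$ and column sums $c$, $\lambda > 0$ is the regularization parameter, and $\oslash$ denotes elementwise division.

```latex
% Entropy-regularized optimal transport (notation assumed for illustration)
P^{\lambda} = \operatorname*{arg\,min}_{P \in U(r, c)} \;
  \langle P, M \rangle - \frac{1}{\lambda} h(P),
\qquad
h(P) = -\sum_{i,j} P_{ij} \log P_{ij}

% The optimum is a diagonal scaling of the kernel K, found by alternating updates
P^{\lambda} = \operatorname{diag}(u) \, K \, \operatorname{diag}(v),
\qquad K = e^{-\lambda M},
\qquad u \leftarrow r \oslash (K v),
\quad v \leftarrow c \oslash (K^{\top} u)
```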

### On‐line expectation–maximization algorithm for latent data models

- Computer Science
- 2009

A generic on‐line version of the expectation–maximization (EM) algorithm applicable to latent variable models of independent observations that is suitable for conditional models, as illustrated in the case of the mixture of linear regressions model.

### Convergence Theorems for Generalized Alternating Minimization Procedures

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2005

This work studies EM variants in which the E-step is not performed exactly, either to obtain improved rates of convergence, or due to approximations needed to compute statistics under a model family over which E-steps cannot be realized.

### Ten Steps of EM Suffice for Mixtures of Two Gaussians

- Computer Science, Mathematics
- COLT
- 2017

This work shows that the population version of EM, where the algorithm is given access to infinitely many samples from the mixture, converges geometrically to the correct mean vectors, and provides simple, closed-form expressions for the convergence rate.

### Statistical guarantees for the EM algorithm: From population to sample-based analysis

- Computer Science, Mathematics
- ArXiv
- 2014

A general framework for proving rigorous guarantees on the performance of the EM algorithm and a variant known as gradient EM and consequences of the general theory for three canonical examples of incomplete-data problems are developed.

### Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration

- Computer Science
- NIPS
- 2017

This paper demonstrates that general optimal transport distances can be approximated in near-linear time by Cuturi's Sinkhorn Distances, and directly suggests a new greedy coordinate descent algorithm, Greenkhorn, with the same theoretical guarantees.

### Variational Inference: A Review for Statisticians

- Computer Science
- ArXiv
- 2016

Variational inference (VI), a method from machine learning that approximates probability densities through optimization, is reviewed and a variant that uses stochastic optimization to scale up to massive data is derived.

### Computational Optimal Transport

- Geology
- Found. Trends Mach. Learn.
- 2019

This short book reviews OT with a bias toward numerical methods and their applications in data sciences, and sheds light on the theoretical properties of OT that make it particularly useful for some of these applications.

### Differentiable Deep Clustering with Cluster Size Constraints

- Computer Science
- ArXiv
- 2019

Empirical evaluations on image classification benchmarks suggest that, compared to state-of-the-art methods, the optimal transport-based approach provides better unsupervised accuracy and does not require a pre-training phase.

### On Convergence Properties of the EM Algorithm for Gaussian Mixtures

- Computer Science
- Neural Computation
- 1996

The mathematical connection between the Expectation-Maximization (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite Gaussian mixtures is built up and an explicit expression for the matrix is provided.