# Optimization Transfer Using Surrogate Objective Functions

@article{Lange2000OptimizationTU, title={Optimization Transfer Using Surrogate Objective Functions}, author={Kenneth L. Lange and David R. Hunter and Ilsoon Yang}, journal={Journal of Computational and Graphical Statistics}, year={2000}, volume={9}, pages={1 - 20} }

Abstract: The well-known EM algorithm is an optimization transfer algorithm that depends on the notion of incomplete or missing data. By invoking convexity arguments, one can construct a variety of other optimization transfer algorithms that do not involve missing data. These algorithms all rely on a majorizing or minorizing function that serves as a surrogate for the objective function. Optimizing the surrogate function drives the objective function in the correct direction. This article…
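
As a concrete illustration of the surrogate idea described in the abstract, the following minimal sketch minimizes the negative log-likelihood of a one-parameter logistic regression by repeatedly minimizing a quadratic majorizer. The example problem, the toy data, and the function names (`neg_log_lik`, `gradient`, `mm_logistic`) are assumptions made here for illustration and are not taken from the article. The majorizer uses the fact that the second derivative of the objective never exceeds one quarter of the sum of squared covariates, so each surrogate minimization has a closed form and no missing data is involved.

```python
import math


def neg_log_lik(beta, xs, ys):
    """Objective f(beta): negative log-likelihood of a one-parameter logistic model."""
    return sum(math.log1p(math.exp(x * beta)) - y * x * beta for x, y in zip(xs, ys))


def gradient(beta, xs, ys):
    """Derivative f'(beta) = sum_i x_i * (p_i - y_i), with p_i = 1 / (1 + exp(-x_i * beta))."""
    return sum(x * (1.0 / (1.0 + math.exp(-x * beta)) - y) for x, y in zip(xs, ys))


def mm_logistic(xs, ys, beta0=0.0, tol=1e-10, max_iter=1000):
    """Minimize f by majorization: each step minimizes a quadratic surrogate of f."""
    # Curvature bound: f''(beta) = sum_i x_i^2 * p_i * (1 - p_i) <= sum_i x_i^2 / 4.
    bound = 0.25 * sum(x * x for x in xs)
    beta = beta0
    history = [neg_log_lik(beta, xs, ys)]
    for _ in range(max_iter):
        # Surrogate: g(b | beta) = f(beta) + f'(beta) * (b - beta) + (bound / 2) * (b - beta)^2.
        # It lies above f everywhere and touches f at b = beta; its minimizer is beta - f'(beta) / bound.
        step = gradient(beta, xs, ys) / bound
        beta -= step
        history.append(neg_log_lik(beta, xs, ys))
        if abs(step) < tol:
            break
    return beta, history


# Toy data (hypothetical, chosen so the classes are not perfectly separable).
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
beta_hat, history = mm_logistic(xs, ys)
print(beta_hat, history[0], history[-1])
```

The list `history` records the objective after every surrogate minimization. Because the surrogate majorizes the objective and agrees with it at the current iterate, the recorded values should never increase.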

## 776 Citations

### Convexity, Surrogate Functions and Iterative Optimization in Multi-class Logistic Regression Models

- Computer Science
- 2004

A family of surrogate maximization algorithms for multi-class logistic regression models (also called conditional exponential models) is proposed, leading to the standard SM, generalized SM, gradient SM, and quadratic SM algorithms.

### A Tutorial on MM Algorithms

- Computer Science
- 2004

The principle behind MM algorithms is explained, some methods for constructing them are suggested, and some of their attractive features are discussed.
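
The principle can be stated compactly for the minimization case. The notation below is an assumption of this sketch rather than a quotation from the tutorial: $f$ is the objective, $\theta^{(m)}$ is the current iterate, and $g(\cdot \mid \theta^{(m)})$ is the surrogate. The first two lines say that the surrogate majorizes $f$ and touches it at the current iterate; the last line is the resulting descent property.

```latex
\begin{align*}
g(\theta \mid \theta^{(m)}) &\ge f(\theta) \quad \text{for all } \theta, \\
g(\theta^{(m)} \mid \theta^{(m)}) &= f(\theta^{(m)}), \\
\theta^{(m+1)} &= \arg\min_{\theta} \, g(\theta \mid \theta^{(m)}), \\
f(\theta^{(m+1)}) \le g(\theta^{(m+1)} \mid \theta^{(m)}) &\le g(\theta^{(m)} \mid \theta^{(m)}) = f(\theta^{(m)}).
\end{align*}
```

For maximization, the inequalities are reversed and the surrogate minorizes the objective.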

### Surrogate maximization/minimization algorithms and extensions

- Computer Science
- Machine Learning
- 2007

The usefulness of SM algorithms is demonstrated by taking logistic regression models, AdaBoost, and the log-linear model as examples, and several SM algorithms are devised, including the standard SM, generalized SM, gradient SM, and quadratic SM algorithms.

### A Tutorial on MM Algorithms

- Computer Science
- 2007

The principle behind MM algorithms is explained, some methods for constructing them are suggested, some of their attractive features are discussed, and new material on constrained optimization and standard error estimation is introduced.

### Majorization-Minimization algorithms for nonsmoothly penalized objective functions

- Computer Science
- 2010

A general class of algorithms for optimizing an extensive variety of nonsmoothly penalized objective functions that satisfy certain regularity conditions is proposed, and convergence theory is established, allowing for fast and stable updating that avoids the need for inverting high-dimensional matrices.

### Variable Selection using MM Algorithms

- Computer Science
- Annals of Statistics
- 2005

This article proposes a new class of algorithms for finding a maximizer of the penalized likelihood for a broad class of penalty functions and proves that when these MM algorithms converge, they must converge to a desirable point.

### Surrogate maximization/minimization algorithms for AdaBoost and the logistic regression model

- Computer Science
- ICML
- 2004

This paper solves the boosting problem by proposing SM algorithms for the corresponding optimization problem and derives an SM algorithm that can be shown to be identical to the algorithm proposed by Collins et al. (2002) based on Bregman distance.

### Generalized Majorization-Minimization for Non-Convex Optimization

- Computer Science
- IJCAI
- 2019

This work gives the first non-asymptotic convergence analysis for MM-like algorithms in general non-convex optimization problems.

### Optimization with First-Order Surrogate Functions

- Computer Science
- ICML
- 2013

A new incremental scheme is introduced that experimentally matches or outperforms state-of-the-art solvers for large-scale optimization problems typically arising in machine learning.

### Automatically Learning Compact Quality-aware Surrogates for Optimization Problems

- Computer Science
- NeurIPS
- 2020

By training a low-dimensional surrogate model end-to-end, and jointly with the predictive model, this work achieves a large reduction in training and inference time and improved performance by focusing attention on the more important variables in the optimization and learning in a smoother space.

## References

### EM algorithms without missing data

- Computer Science
- Statistical Methods in Medical Research
- 1997

A theoretical perspective clarifies the operation of the EM algorithm and suggests novel generalizations that lead to highly stable algorithms with well-understood local and global convergence properties in medical statistics.

### Parameter expansion to accelerate EM: The PX-EM algorithm

- Computer Science
- 1997

This work proposes a parameter-expanded EM, PX-EM, algorithm, which shares the simplicity and stability of ordinary EM, but has a faster rate of convergence since its M step performs a more efficient analysis.

### Monotonicity of quadratic-approximation algorithms

- Mathematics
- 1988

It is desirable that a numerical maximization algorithm monotonically increase its objective function for the sake of its stability of convergence. It is here shown how one can adjust the…

### The Art of Data Augmentation

- Computer Science
- 2001

An effective search strategy is introduced that combines the ideas of marginal augmentation and conditional augmentation, together with a deterministic approximation method for selecting good augmentation schemes to obtain efficient Markov chain Monte Carlo algorithms for posterior sampling.

### Conjugate Gradient Acceleration of the EM Algorithm

- Computer Science
- 1993

The key, as it is shown, is that the EM step can be viewed as a generalized gradient, making it natural to apply generalized conjugate gradient methods in an attempt to accelerate the EM algorithm.

### A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms

- Computer Science
- 1990

Two modifications to the MCEM algorithm (the poor man's data augmentation algorithms), which allow for the calculation of the entire posterior, are presented and serve as diagnostics for the validity of the posterior distribution.

### An Optimization Transfer Algorithm for Quantile Regression

- Mathematics
- 1998

The q quantiles of an integrable random variable solve a minimization problem involving a certain expectation. This optimality principle suggests an algorithm for finding a sample quantile without…
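
The truncated summary does not show the minimization problem itself. The standard characterization, which is assumed here to be the one meant, uses the check loss $\rho_q$; the symbols $X$, $\theta$, and $\rho_q$ are notation chosen for this sketch.

```latex
\begin{align*}
\rho_q(r) &=
\begin{cases}
q \, r, & r \ge 0, \\
-(1 - q) \, r, & r < 0,
\end{cases}
\qquad 0 < q < 1, \\
\theta_q &\in \arg\min_{\theta} \; \mathbb{E}\left[ \rho_q(X - \theta) \right].
\end{align*}
```

For $q = 1/2$ the loss reduces to $\tfrac{1}{2}\lvert r \rvert$, so the minimizers are the medians of $X$. Replacing the expectation by a sample average gives the corresponding problem for a sample quantile.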

### Monotonic algorithms for transmission tomography

- Computer Science
- IEEE Transactions on Medical Imaging
- 1999

The new algorithms are based on paraboloidal surrogate functions for the log likelihood, which lead to monotonic algorithms even for the nonconvex log likelihood that arises due to background events, such as scatter and random coincidences.

### Quantile Regression via an MM Algorithm

- Mathematics
- 2000

Quantile regression is an increasingly popular method for estimating the quantiles of a distribution conditional on the values of covariates. Regression quantiles are robust against the…

### An adaptive barrier method for convex programming

- Mathematics
- 1994

This paper presents a new barrier method for convex programming. The method involves an optimization transfer principle. Instead of minimizing the objective function f(x) directly, one minimizes the…