Optimization Transfer Using Surrogate Objective Functions

@article{Lange2000OptimizationTU,
  title={Optimization Transfer Using Surrogate Objective Functions},
  author={Kenneth L. Lange and David R. Hunter and Ilsoon Yang},
  journal={Journal of Computational and Graphical Statistics},
  year={2000},
  volume={9},
  pages={1--20}
}
Abstract

The well-known EM algorithm is an optimization transfer algorithm that depends on the notion of incomplete or missing data. By invoking convexity arguments, one can construct a variety of other optimization transfer algorithms that do not involve missing data. These algorithms all rely on a majorizing or minorizing function that serves as a surrogate for the objective function. Optimizing the surrogate function drives the objective function in the correct direction. This article…
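
To make the majorize-then-optimize loop the abstract describes concrete, here is a minimal sketch for logistic regression, assuming the standard quadratic majorization built from Böhning and Lindsay's curvature bound B = (1/4) XᵀX. The function name mm_logistic and the toy data are illustrative, not taken from the article.

import numpy as np

def mm_logistic(X, y, n_iter=200):
    """Fit logistic regression by majorize-then-minimize (MM).

    The negative log-likelihood f(b) has Hessian X^T W X with
    W = diag(p(1 - p)) <= I/4, so B = (1/4) X^T X bounds the curvature
    from above (Bohning & Lindsay's bound).  The quadratic surrogate
        g(b | b_t) = f(b_t) + grad f(b_t)^T (b - b_t)
                     + (1/2) (b - b_t)^T B (b - b_t)
    lies above f and touches it at b_t, so its exact minimizer
        b_{t+1} = b_t - B^{-1} grad f(b_t)
    can never increase f.
    """
    n, p = X.shape
    B = 0.25 * (X.T @ X)                       # fixed curvature bound, computed once
    b = np.zeros(p)
    for _ in range(n_iter):
        probs = 1.0 / (1.0 + np.exp(-X @ b))   # fitted probabilities
        grad = X.T @ (probs - y)               # gradient of f at the current iterate
        b = b - np.linalg.solve(B, grad)       # minimize the surrogate exactly
    return b

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=200) > 0).astype(float)
beta_hat = mm_logistic(X, y)

Because the surrogate lies above the objective and touches it at the current iterate, each update drives the objective downhill; this guaranteed monotonicity, rather than speed, is the property the abstract emphasizes.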

Convexity, Surrogate Functions and Iterative Optimization in Multi-class Logistic Regression Models

TLDR
A family of surrogate maximization algorithms for multi-class logistic regression models (also called conditional exponential models) is proposed, leading to the standard SM, generalized SM, gradient SM, and quadratic SM algorithms.

A Tutorial on MM Algorithms

TLDR
The principle behind MM algorithms is explained, some methods for constructing them are suggested, and some of their attractive features are discussed.

Surrogate maximization/minimization algorithms and extensions

TLDR
The usefulness of SM algorithms is demonstrated by taking logistic regression models, AdaBoost, and the log-linear model as examples and devising several SM algorithms, including the standard SM, generalized SM, gradient SM, and quadratic SM algorithms.

A Tutorial on MM Algorithms

TLDR
The principle behind MM algorithms is explained, some methods for constructing them are suggested, some of their attractive features are discussed, and new material on constrained optimization and standard error estimation is introduced.

Majorization-Minimization algorithms for nonsmoothly penalized objective functions

TLDR
A general class of algorithms for optimizing an extensive variety of nonsmoothly penalized objective functions that satisfy certain regularity conditions is proposed, and convergence theory is established, allowing for fast and stable updating that avoids the need for inverting high-dimensional matrices.

Variable Selection using MM Algorithms.

TLDR
This article proposes a new class of algorithms for finding a maximizer of the penalized likelihood for a broad class of penalty functions and proves that when these MM algorithms converge, they must converge to a desirable point.

Surrogate maximization/minimization algorithms for AdaBoost and the logistic regression model

TLDR
This paper solves the boosting problem by proposing SM algorithms for the corresponding optimization problem and derives an SM algorithm that can be shown to be identical to the algorithm proposed by Collins et al. (2002) based on Bregman distance.

Generalized Majorization-Minimization for Non-Convex Optimization

TLDR
This work gives the first non-asymptotic convergence analysis for MM-like algorithms in general non-convex optimization.

Optimization with First-Order Surrogate Functions

TLDR
A new incremental scheme is introduced that experimentally matches or outperforms state-of-the-art solvers for large-scale optimization problems typically arising in machine learning.

Automatically Learning Compact Quality-aware Surrogates for Optimization Problems

TLDR
By training a low-dimensional surrogate model end-to-end, jointly with the predictive model, this work achieves a large reduction in training and inference time and improves performance by focusing attention on the more important variables in the optimization and by learning in a smoother space.
...

References

Showing 1-10 of 111 references

EM algorithms without missing data

TLDR
A theoretical perspective clarifies the operation of the EM algorithm and suggests novel generalizations that lead to highly stable algorithms with well-understood local and global convergence properties, with applications in medical statistics.

Parameter expansion to accelerate EM: The PX-EM algorithm

TLDR
This work proposes the parameter-expanded EM (PX-EM) algorithm, which shares the simplicity and stability of ordinary EM but has a faster rate of convergence, since its M step performs a more efficient analysis.

Monotonicity of quadratic-approximation algorithms

It is desirable that a numerical maximization algorithm monotonically increase its objective function for the sake of its stability of convergence. It is here shown how one can adjust the…

The Art of Data Augmentation

TLDR
An effective search strategy is introduced that combines the ideas of marginal augmentation and conditional augmentation, together with a deterministic approximation method for selecting good augmentation schemes to obtain efficient Markov chain Monte Carlo algorithms for posterior sampling.

Conjugate Gradient Acceleration of the EM Algorithm

TLDR
The key, as is shown, is that the EM step can be viewed as a generalized gradient, making it natural to apply generalized conjugate gradient methods in an attempt to accelerate the EM algorithm.

A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms

TLDR
Two modifications of the MCEM algorithm, the poor man's data augmentation algorithms, are presented; they allow calculation of the entire posterior and serve as diagnostics for the validity of the posterior distribution.

An Optimization Transfer Algorithm for Quantile Regression

The q quantiles of an integrable random variable solve a minimization problem involving a certain expectation. This optimality principle suggests an algorithm for finding a sample quantile without…
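
As a hedged aside (this is the standard check-function characterization from quantile theory, not quoted from this reference), the minimization problem in question can be stated as:

\rho_\tau(u) = u\,(\tau - \mathbf{1}\{u < 0\}), \qquad
q_\tau \in \arg\min_{c \in \mathbb{R}} \mathbb{E}\bigl[\rho_\tau(Y - c)\bigr], \qquad \tau \in (0, 1).

Taking \tau = 1/2 recovers the median, since \rho_{1/2}(u) = |u|/2; replacing the expectation with a sample average gives the sample-quantile problem that an optimization transfer algorithm can address.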

Monotonic algorithms for transmission tomography

TLDR
The new algorithms are based on paraboloidal surrogate functions for the log likelihood, which lead to monotonic algorithms even for the nonconvex log likelihood that arises due to background events, such as scatter and random coincidences.

Quantile Regression via an MM Algorithm

Quantile regression is an increasingly popular method for estimating the quantiles of a distribution conditional on the values of covariates. Regression quantiles are robust against the…

An adaptive barrier method for convex programming

This paper presents a new barrier method for convex programming. The method involves an optimization transfer principle. Instead of minimizing the objective function f(x) directly, one minimizes the…
...