• Corpus ID: 7703985

Dykstra's Algorithm, ADMM, and Coordinate Descent: Connections, Insights, and Extensions

@inproceedings{Tibshirani2017DykstrasAA,
  title={Dykstra's Algorithm, ADMM, and Coordinate Descent: Connections, Insights, and Extensions},
  author={Ryan J. Tibshirani},
  booktitle={NIPS},
  year={2017}
}
  • R. Tibshirani
  • Published in NIPS 12 May 2017
  • Computer Science, Mathematics
We study connections between Dykstra's algorithm for projecting onto an intersection of convex sets, the alternating direction method of multipliers (ADMM), and block coordinate descent. We prove that coordinate descent for a regularized regression problem, in which the (separable) penalty functions are seminorms, is exactly equivalent to Dykstra's algorithm applied to the dual problem. ADMM on the dual problem is also seen to be equivalent, in the special case of two sets, with one being a linear subspace.
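As a concrete reference point for the equivalence, here is a minimal NumPy sketch of the classical Dykstra iteration, assuming each set comes with an exact Euclidean projection operator; the per-set correction terms `p[i]` are the quantities that the paper identifies with dual coordinate descent iterates. The example sets and names (`proj_box`, `proj_halfspace`) are illustrative, not from the paper.

```python
import numpy as np

def dykstra(y, projections, n_iter=100):
    """Project y onto the intersection of the sets given by `projections`."""
    x = y.copy()
    # One correction term per set; Dykstra's algorithm differs from plain
    # cyclic projections exactly by carrying these terms along.
    p = [np.zeros_like(y) for _ in projections]
    for _ in range(n_iter):
        for i, proj in enumerate(projections):
            z = proj(x + p[i])       # project the corrected point onto set i
            p[i] = x + p[i] - z      # update the correction for set i
            x = z
    return x

# Example: project (2, 2) onto the box [0, 1]^2 intersected with the
# halfspace {x : a^T x <= b}; the result is (0.5, 0.5).
proj_box = lambda v: np.clip(v, 0.0, 1.0)
a, b = np.array([1.0, 1.0]), 1.0
proj_halfspace = lambda v: v - max(0.0, (a @ v - b) / (a @ a)) * a
print(dykstra(np.array([2.0, 2.0]), [proj_box, proj_halfspace]))
```

Without the correction terms this reduces to von Neumann's cyclic projections, which converges to a point in the intersection but not, in general, to the projection of y.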

Citations

Dual Extrapolation for Sparse Generalized Linear Models

It is shown that the dual iterates of a GLM solver exhibit Vector AutoRegressive (VAR) behavior after sign identification when the primal problem is solved with proximal gradient descent or cyclic coordinate descent.

Algorithms and software for projections onto intersections of convex and non-convex sets with applications to inverse problems

Results show that regularized inverse problems in physical parameter estimation and image processing benefit from working with all available prior information, and need not be limited to one or two regularizers by algorithmic, computational, or hyper-parameter selection issues.

Greed is good: greedy optimization methods for large-scale structured problems

This dissertation shows that greedy coordinate descent and Kaczmarz methods admit efficient implementations and can be faster than their randomized counterparts for certain common problem structures in machine learning, and establishes linear convergence for greedy (block) coordinate descent methods under a relaxation of strong convexity dating from 1963 (the Polyak-Łojasiewicz condition).
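For reference, the greedy selection rule at issue is the Gauss-Southwell rule: update the coordinate with the largest gradient magnitude. Below is a minimal sketch on a smooth, strongly convex quadratic; the problem instance is synthetic and illustrative.

```python
import numpy as np

# Quadratic objective f(x) = 0.5 * x^T A x - b^T x with A positive definite.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 10))
A = M.T @ M + 0.1 * np.eye(10)
b = rng.standard_normal(10)

x = np.zeros(10)
for _ in range(200):
    grad = A @ x - b
    i = np.argmax(np.abs(grad))   # Gauss-Southwell: steepest coordinate
    x[i] -= grad[i] / A[i, i]     # exact minimization along coordinate i
print(np.linalg.norm(A @ x - b))  # gradient norm; near zero at convergence
```

The dissertation's point is that for suitably structured problems (e.g. sparse dependencies in A), the argmax can be maintained cheaply, so a greedy step costs little more than a random one while making more progress per iteration.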

A Projection Method for Metric-Constrained Optimization

It is proved that the metric-constrained linear program relaxation of correlation clustering is equivalent to a special case of the metric nearness problem, and a general solver is developed by generalizing and improving a simple projection algorithm originally designed for metric nearness.

An efficient implementable inexact entropic proximal point algorithm for a class of linear programming problems

This work designs an implementable inexact entropic proximal point algorithm (iEPPA), combined with an easy-to-implement dual block coordinate descent method as a subsolver, that is highly efficient and robust for solving large-scale capacity-constrained multi-marginal optimal transport (CMOT) problems.

Celer: a Fast Solver for the Lasso with Dual Extrapolation

This work proposes an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of improved dual points, which enables a tighter control of optimality as used in stopping criterion, as well as better screening performance of Gap Safe rules.

Dual Extrapolation for Faster Lasso Solvers

This work proposes an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of an improved dual point, which enables a tighter control of optimality as used in stopping criterion, as well as better screening performance of Gap Safe rules.
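One reading of the extrapolation technique in the two entries above: keep the last few dual residuals, form their minimum-norm affine combination, then rescale the result to be feasible for the lasso dual. The sketch below follows that reading and is not the authors' implementation; `extrapolate` and `rescale` are illustrative names.

```python
import numpy as np

def extrapolate(R):
    """R: (K, n) array whose rows are the K most recent dual residuals.
    Returns c @ R minimizing ||c @ R||_2 subject to sum(c) = 1."""
    G = R @ R.T
    # Small ridge term guards against a singular Gram matrix.
    z = np.linalg.solve(G + 1e-12 * np.trace(G) * np.eye(len(G)),
                        np.ones(len(G)))
    c = z / z.sum()
    return c @ R

def rescale(r, X, lam):
    """Shrink r so that ||X^T r||_inf <= lam, making r feasible for the
    lasso dual (the standard Gap Safe rescaling)."""
    scale = np.max(np.abs(X.T @ r))
    return r * min(1.0, lam / max(scale, 1e-12))

# Usage: R stacks residuals y - X @ beta_t from the last few solver passes;
# the extrapolated, rescaled point tightens the duality gap used in the
# stopping criterion and in screening rules.
```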

SDP-Based Bounds for the Quadratic Cycle Cover Problem via Cutting-Plane Augmented Lagrangian Methods and Reinforcement Learning

This paper studies the application of semidefinite programming (SDP) to obtain strong bounds for the quadratic cycle cover problem (QCCP) and proposes a new approach in which an augmented Lagrangian method is incorporated into a cutting-plane framework by utilizing Dykstra’s projection algorithm.

Metric-Constrained Optimization for Graph Clustering Algorithms

This work develops a new approach for solving linear programming relaxations of NP-hard graph clustering problems that enforce triangle inequality constraints on the output variables, and proves that the linear programming relaxation of the correlation clustering objective is equivalent to a special case of the well-known metric nearness problem.

Distributed Deterministic Asynchronous Algorithms in Time-Varying Graphs Through Dykstra Splitting

  • C. Pang
  • Mathematics, Computer Science
    SIAM J. Optim.
  • 2019
This work considers the setting where each vertex of a graph has a function, and communications can only occur between vertices connected by an edge, and proposes a distributed version of Dykstra's algorithm to minimize the sum of these functions.

References

Showing 1-10 of 60 references

A cyclic projection algorithm via duality

We consider the problem of finding the projection of a given point in a Hilbert space onto the intersection of finitely many closed convex sets. A very simple iterative procedure for this problem was established by Dykstra.

On the convergence of the coordinate descent method for convex differentiable minimization

The coordinate descent method enjoys a long history in convex differentiable minimization. Surprisingly, very little is known about the convergence of the iterates generated by this method.

Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization

We study the convergence properties of a (block) coordinate descent method applied to minimize a nondifferentiable (nonconvex) function f(x1, ..., xN) with certain separability and regularity properties.

On the linear convergence of the alternating direction method of multipliers

This paper establishes the global R-linear convergence of the ADMM for minimizing the sum of any number of convex separable functions, assuming that a certain error bound condition holds true and the dual stepsize is sufficiently small.

Pathwise Coordinate Optimization

It is shown that coordinate descent is very competitive with the well-known LARS procedure in large lasso problems, can deliver a path of solutions efficiently, and can be applied to many other convex statistical problems such as the garotte and elastic net.
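The workhorse update here is one-dimensional soft-thresholding. Below is a minimal sketch of cyclic coordinate descent for the lasso, min_b 0.5 ||y - X b||_2^2 + lam ||b||_1, on which the pathwise strategy (warm-starting down a grid of lam values) is built; data and helper names are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    p = X.shape[1]
    beta = np.zeros(p)
    r = y - X @ beta                # running full residual
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]  # form the partial residual
            beta[j] = soft_threshold(X[:, j] @ r, lam) / col_sq[j]
            r -= X[:, j] * beta[j]  # restore the full residual
    return beta
```

Maintaining the residual makes each coordinate update O(n), which is what makes computing a full regularization path affordable.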

Accelerated Alternating Direction Method of Multipliers

The Accelerated Alternating Direction Method of Multipliers (A2DM2) is introduced, which solves problems with the same structure as ADMM but with a faster convergence rate, and is shown to be competitive with state-of-the-art specialized algorithms on both scalability and accuracy.

Parallel coordinate descent methods for big data optimization

In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex function and a simple separable convex function.

The rate of convergence of Dykstra's cyclic projections algorithm: The polyhedral case

Suppose K is the intersection of a finite number of closed half-spaces in a Hilbert space X. Starting with any point x ∈ X, it is shown that the sequence of iterates {x_n} generated by Dykstra's cyclic projections algorithm converges to the projection of x onto K at a geometric rate.

A General Analysis of the Convergence of ADMM

This work provides a new proof of the linear convergence of the alternating direction method of multipliers when one of the objective terms is strongly convex, and demonstrates that minimizing the derived bound on the convergence rate provides a practical approach to selecting algorithm parameters for particular ADMM instances.
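To fix the two-block structure these ADMM results concern, here is a minimal sketch of scaled-form ADMM for the lasso (split as min 0.5 ||X b - y||_2^2 + lam ||z||_1 subject to b = z). The penalty parameter rho, whose choice bounds like the one derived above can inform, is simply held fixed and is illustrative.

```python
import numpy as np

def admm_lasso(X, y, lam, rho=1.0, n_iter=200):
    p = X.shape[1]
    z, u = np.zeros(p), np.zeros(p)
    # Factor the ridge-like system for the b-update once, reuse each iteration.
    L = np.linalg.cholesky(X.T @ X + rho * np.eye(p))
    Xty = X.T @ y
    for _ in range(n_iter):
        b = np.linalg.solve(L.T, np.linalg.solve(L, Xty + rho * (z - u)))
        # z-update is the prox of lam*||.||_1, i.e. soft-thresholding.
        z = np.sign(b + u) * np.maximum(np.abs(b + u) - lam / rho, 0.0)
        u += b - z                  # scaled dual ascent step
    return z
```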

Dykstra's algorithm with Bregman projections: A convergence proof

Dykstra’s algorithm and the method of cyclic Bregman projections are often employed to solve best approximation and convex feasibility problems, which are fundamental in mathematics and the physical sciences.
...