# Subgradient methods for huge-scale optimization problems

@article{Nesterov2014SubgradientMF, title={Subgradient methods for huge-scale optimization problems}, author={Yurii Nesterov}, journal={Mathematical Programming}, year={2014}, volume={146}, pages={275-297} }

We consider a new class of huge-scale problems: problems with sparse subgradients. The most important functions of this type are piecewise linear. For optimization problems with uniform sparsity of the corresponding linear operators, we suggest a very efficient implementation of subgradient iterations whose total cost depends logarithmically on the dimension. This technique is based on a recursive update of the results of matrix/vector products and of the values of symmetric functions. It works…
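The recursive-update idea from the abstract can be sketched in a few lines (a minimal illustration under our own naming — `MaxTree`, `sparse_step`, and the column-wise storage are assumptions for this sketch, not the paper's actual implementation): keep the product $y = Ax$ and a binary max-tree over $y$, so that a sparse change to $x$ touches only a few entries of $y$, each costing $O(\log m)$ tree work instead of a full recomputation.

```python
class MaxTree:
    """Binary tree maintaining max(y) under point updates in O(log m)."""
    def __init__(self, y):
        self.m = len(y)
        self.t = [float("-inf")] * (2 * self.m)
        self.t[self.m:2 * self.m] = list(y)      # leaves hold y
        for i in range(self.m - 1, 0, -1):        # internal nodes hold child max
            self.t[i] = max(self.t[2 * i], self.t[2 * i + 1])

    def update(self, i, value):
        """Set y[i] = value and refresh the O(log m) ancestors."""
        i += self.m
        self.t[i] = value
        i //= 2
        while i:
            self.t[i] = max(self.t[2 * i], self.t[2 * i + 1])
            i //= 2

    def max(self):
        return self.t[1]                          # root = max over all of y


def sparse_step(cols, y, tree, x, j, delta):
    """Apply x[j] += delta, where column j of A is sparse.

    cols[j] lists the nonzeros (row, a_ij) of column j; only those rows of
    y = A x change, and each triggers one O(log m) tree update, so the cost
    of the whole step is (nonzeros in column j) * O(log m).
    """
    x[j] += delta
    for i, a_ij in cols[j]:
        y[i] += a_ij * delta
        tree.update(i, y[i])
```

For a piecewise-linear objective such as $f(x) = \max_i (Ax)_i$, the tree root then delivers the current function value after every sparse subgradient step without ever recomputing the full product $Ax$.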

## 82 Citations

A Subgradient Method for Non-Smooth Vector Optimization Problems

- Mathematics
- 2015

Vector optimization problems are a significant extension of scalar optimization and have a wide range of applications in various fields of economics, decision theory, game theory, information theory and…

Parallel coordinate descent methods for big data optimization

- Mathematics, Computer Science, Math. Program.
- 2016

In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex…

On the behavior of first-order penalty methods for conic constrained convex programming when Lagrange multipliers do not exist

- Mathematics, Computer Science, 2015 54th IEEE Conference on Decision and Control (CDC)
- 2015

This work derives the iteration complexity of the classical quadratic penalty method, where the corresponding penalty regularized formulation of the original problem is solved using Nesterov's fast gradient algorithm.

On the Properties of the Method of Minimization for Convex Functions with Relaxation on the Distance to Extremum

- Computer Science, Mathematics, Autom. Remote. Control.
- 2019

A subgradient method of minimization, similar to the method of minimal iterations for solving systems of equations, which inherits from the latter its convergence properties on quadratic functions; it is proved that on some classes of functions the method converges at the rate of a geometric progression.

Primal-Dual Subgradient Method for Huge-Scale Linear Conic Problems

- Mathematics, Computer Science, SIAM J. Optim.
- 2014

The main assumption is that the primal cone is formed as a direct product of many small-dimensional convex cones and that the matrix of the corresponding linear operator is uniformly sparse; under these assumptions the method approximates the primal-dual optimal solution to accuracy $\epsilon$ in $O\big(\tfrac{1}{\epsilon^2}\big)$ iterations.

Stochastic Block Mirror Descent Methods for Nonsmooth and Stochastic Optimization

- Mathematics, Computer Science, SIAM J. Optim.
- 2015

Establishes the rate of convergence of the SBMD method, together with its associated large-deviation results, for solving general nonsmooth and stochastic optimization problems; some of the results appear to be new even for block coordinate descent methods in deterministic optimization.

Advances in Low-Memory Subgradient Optimization

- Computer Science, Mathematics
- 2020

This chapter is devoted to black-box subgradient algorithms with minimal requirements on the storage of the auxiliary results needed to execute them, and proposes two adaptive mirror descent methods that are optimal in terms of complexity bounds.

Randomized First-Order Methods for Saddle Point Optimization

- Mathematics
- 2014

In this paper, we present novel randomized algorithms for solving saddle point problems whose dual feasible region is given by the direct product of many convex sets. Our algorithms can achieve…

Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization

- Mathematics, Computer Science, ICML
- 2020

An interesting finding is that the optimal progression of precision across iterations is independent of the low-dimensional CM routine employed, suggesting a general framework for extending low-dimensional optimization routines to high-dimensional problems.

Stochastic Intermediate Gradient Method for Convex Problems with Stochastic Inexact Oracle

- Computer Science, Mathematics, J. Optim. Theory Appl.
- 2016

The first method extends the Intermediate Gradient Method proposed by Devolder, Glineur and Nesterov for problems with a deterministic inexact oracle; it can be applied to problems with a composite objective function, handles both deterministic and stochastic inexactness of the oracle, and allows a non-Euclidean setup.

## References

SHOWING 1-10 OF 48 REFERENCES

Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems

- Mathematics, Computer Science, SIAM J. Optim.
- 2012

Surprisingly enough, for certain classes of objective functions, the complexity bounds of the proposed methods for solving huge-scale optimization problems are better than the standard worst-case bounds for deterministic algorithms.

Primal-dual subgradient methods for convex problems

- Computer Science, Mathematics, Math. Program.
- 2009

A new approach for constructing subgradient schemes for different types of nonsmooth problems with convex structure; the schemes are primal-dual in that they are always able to generate a feasible approximation to the optimum of an appropriately formulated dual problem.

Stochastic first order methods in smooth convex optimization

- Mathematics
- 2011

In this paper, we are interested in the development of efficient first-order methods for convex optimization problems in the simultaneous presence of smoothness of the objective function and…

Smooth minimization of non-smooth functions

- Mathematics, Computer Science, Math. Program.
- 2005

A new approach for constructing efficient schemes for non-smooth convex optimization is proposed, based on a special smoothing technique, which can be applied to functions with explicit max-structure, and can be considered as an alternative to black-box minimization.

Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function

- Mathematics, Computer Science, Math. Program.
- 2014

A randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function is developed, and it is proved to obtain an accurate solution with probability at least $1-\rho$ in at most $O(n/\varepsilon)$ iterations, thus achieving the first true iteration complexity bounds.

An Introduction to Optimization

- Mathematics, IEEE Antennas and Propagation Magazine
- 1996

Preface. Mathematical Review: Methods of Proof and Some Notation; Vector Spaces and Matrices; Transformations; Concepts from Geometry; Elements of Calculus. Unconstrained Optimization: Basics of…

Characterizations of linear suboptimality for mathematical programs with equilibrium constraints

- Mathematics, Computer Science, Math. Program.
- 2009

Based on robust generalized differential calculus, new results are derived giving pointwise necessary and sufficient conditions for linear suboptimality in general MPECs and its important specifications involving variational and quasivariational inequalities, implicit complementarity problems, etc.

Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization

- Mathematics, Computer Science, Neural Computation
- 2012

This letter proposes a simple way to significantly accelerate two well-known algorithms designed to solve NMF problems: the multiplicative updates of Lee and Seung and the hierarchical alternating least squares of Cichocki et al.

On the Convergence Rate of Dual Ascent Methods for Linearly Constrained Convex Minimization

- Mathematics, Computer Science, Math. Oper. Res.
- 1993

This study analyzes the rate of convergence of certain dual ascent methods for the problem of minimizing a strictly convex, essentially smooth function subject to linear constraints and shows that, under mild assumptions on the problem, these methods attain a linear rate of convergence.

Sufficient and necessary conditions for perpetual multi-assets exchange options

- Mathematics
- 2011

This paper considers the general problem of optimal timing of the exchange of the sum of n Itô diffusions for the sum of m others (e.g., the optimal time to exchange a geometric Brownian motion for a…