# Subgradient methods for huge-scale optimization problems

@article{Nesterov2014SubgradientMF,
title={Subgradient methods for huge-scale optimization problems},
author={Yurii Nesterov},
journal={Mathematical Programming},
year={2014},
volume={146},
pages={275-297}
}
• Y. Nesterov
• Published 1 August 2014
• Mathematics, Computer Science
• Mathematical Programming
We consider a new class of huge-scale problems, the problems with sparse subgradients. The most important functions of this type are piecewise linear. For optimization problems with uniform sparsity of the corresponding linear operators, we suggest a very efficient implementation of subgradient iterations, whose total cost depends logarithmically on the dimension. This technique is based on a recursive update of the results of matrix/vector products and the values of symmetric functions. It works…
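The recursive-update idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the objective is the piecewise linear function f(x) = max_i <a_i, x>, stores the sparse matrix as per-row and per-column dictionaries, and keeps the maximum of y = Ax in a binary tree, so that changing one entry of y costs O(log m). The names `MaxTree`, `build_columns`, and `sparse_subgradient_step` are hypothetical.

```python
import math


class MaxTree:
    """Binary tree over a vector; the root holds the (max value, index) pair."""

    def __init__(self, values):
        self.size = 1
        while self.size < len(values):
            self.size *= 2
        # Padding leaves hold -inf so they never win the maximum.
        self.nodes = [(-math.inf, -1)] * (2 * self.size)
        for i, v in enumerate(values):
            self.nodes[self.size + i] = (v, i)
        for k in range(self.size - 1, 0, -1):
            self.nodes[k] = max(self.nodes[2 * k], self.nodes[2 * k + 1])

    def update(self, i, value):
        """Change entry i and repair only the path to the root: O(log m)."""
        k = self.size + i
        self.nodes[k] = (value, i)
        k //= 2
        while k >= 1:
            self.nodes[k] = max(self.nodes[2 * k], self.nodes[2 * k + 1])
            k //= 2

    def argmax(self):
        return self.nodes[1]


def build_columns(rows, n):
    """Column-wise view of a matrix given as a list of {column: value} rows."""
    cols = [dict() for _ in range(n)]
    for i, row in enumerate(rows):
        for j, a_ij in row.items():
            cols[j][i] = a_ij
    return cols


def sparse_subgradient_step(rows, cols, x, y, tree, step):
    """One step x <- x - step * a_i, where row i attains max_i <a_i, x>.

    Only the entries of x in the support of a_i change, and only the entries
    of y = Ax in the affected columns are updated (in place) in y and the tree.
    """
    _, i = tree.argmax()
    for j, a_ij in rows[i].items():
        delta = -step * a_ij
        x[j] += delta
        for r, a_rj in cols[j].items():
            y[r] += a_rj * delta
            tree.update(r, y[r])
    return i


# Small illustrative instance (made-up data).
rows = [{0: 1.0, 1: -1.0}, {1: 2.0}, {0: -1.0, 2: 1.0}]
x = [1.0, 1.0, 1.0]
cols = build_columns(rows, len(x))
y = [sum(a * x[j] for j, a in row.items()) for row in rows]
tree = MaxTree(y)
active_row = sparse_subgradient_step(rows, cols, x, y, tree, step=0.5)
```

If each row and each column has only a few nonzeros, one step in this sketch touches a small number of entries of `x` and `y`, and each tree repair is logarithmic in the number of rows. The full product `Ax` is never recomputed.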

## Citations

A SUBGRADIENT METHODS FOR NON-SMOOTH VECTOR OPTIMIZATION PROBLEMS
Vector optimization problems are a significant extension of scalar optimization and have a wide range of applications in various fields of economics, decision theory, game theory, information theory and
Parallel coordinate descent methods for big data optimization
• Mathematics, Computer Science
Math. Program.
• 2016
In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex
On the behavior of first-order penalty methods for conic constrained convex programming when Lagrange multipliers do not exist
• Mathematics, Computer Science
2015 54th IEEE Conference on Decision and Control (CDC)
• 2015
This work derives the iteration complexity of the classical quadratic penalty method, where the corresponding penalty regularized formulation of the original problem is solved using Nesterov's fast gradient algorithm.
On the Properties of the Method of Minimization for Convex Functions with Relaxation on the Distance to Extremum
• Computer Science, Mathematics
Autom. Remote. Control.
• 2019
A subgradient method of minimization is proposed, similar to the method of minimal iterations for solving systems of equations, from which it inherits convergence properties on quadratic functions; it is proved that on some classes of functions it converges at the rate of a geometric progression.
Primal-Dual Subgradient Method for Huge-Scale Linear Conic Problems
• Mathematics, Computer Science
SIAM J. Optim.
• 2014
The main assumption is that the primal cone is formed as a direct product of many small-dimensional convex cones and that the matrix of the corresponding linear operator is uniformly sparse; the proposed method can approximate the primal-dual optimal solution with accuracy $\epsilon$ in $O\big({1 \over \epsilon^2}\big)$ iterations.
Stochastic Block Mirror Descent Methods for Nonsmooth and Stochastic Optimization
• Mathematics, Computer Science
SIAM J. Optim.
• 2015
The rate of convergence of the SBMD method, along with its associated large-deviation results, is established for solving general nonsmooth and stochastic optimization problems; some of the results seem to be new for block coordinate descent methods for deterministic optimization.
• Computer Science, Mathematics
• 2020
This chapter is devoted to the blackbox subgradient algorithms with the minimal requirements for the storage of auxiliary results, which are necessary to execute these algorithms, and proposes two adaptive mirror descent methods which are optimal in terms of complexity bounds.
Randomized First-Order Methods for Saddle Point Optimization
• Mathematics
• 2014
In this paper, we present novel randomized algorithms for solving saddle point problems whose dual feasible region is given by the direct product of many convex sets. Our algorithms can achieve
Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization
• Mathematics, Computer Science
ICML
• 2020
An interesting finding is that the optimal progression of precision across iterations is independent of the low-dimensional CM routine employed, suggesting a general framework for extending low-dimensional optimization routines to high-dimensional problems.
Stochastic Intermediate Gradient Method for Convex Problems with Stochastic Inexact Oracle
• Computer Science, Mathematics
J. Optim. Theory Appl.
• 2016
The first method is an extension of the Intermediate Gradient Method proposed by Devolder, Glineur and Nesterov for problems with deterministic inexact oracle and can be applied to problems with composite objective function, both deterministic and stochastic inexactness of the oracle, and allows using a non-Euclidean setup.

## References

Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems
• Y. Nesterov
• Mathematics, Computer Science
SIAM J. Optim.
• 2012
Surprisingly enough, for certain classes of objective functions, the proposed methods for solving huge-scale optimization problems are better than the standard worst-case bounds for deterministic algorithms.
Primal-dual subgradient methods for convex problems
• Y. Nesterov
• Computer Science, Mathematics
Math. Program.
• 2009
A new approach for constructing subgradient schemes for different types of nonsmooth problems with convex structure; the methods are primal-dual since they are always able to generate a feasible approximation to the optimum of an appropriately formulated dual problem.
Stochastic first order methods in smooth convex optimization
In this paper, we are interested in the development of efficient first-order methods for convex optimization problems in the simultaneous presence of smoothness of the objective function and
Smooth minimization of non-smooth functions
• Y. Nesterov
• Mathematics, Computer Science
Math. Program.
• 2005
A new approach for constructing efficient schemes for non-smooth convex optimization is proposed, based on a special smoothing technique, which can be applied to functions with explicit max-structure, and can be considered as an alternative to black-box minimization.
Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function
• Mathematics, Computer Science
Math. Program.
• 2014
A randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function is developed and it is proved that it obtains an accurate solution with probability at least $1-\rho$ in at most $O(n/\varepsilon)$ iterations, thus achieving first true iteration complexity bounds.
An Introduction to Optimization
• Mathematics
IEEE Antennas and Propagation Magazine
• 1996
Preface. MATHEMATICAL REVIEW. Methods of Proof and Some Notation. Vector Spaces and Matrices. Transformations. Concepts from Geometry. Elements of Calculus. UNCONSTRAINED OPTIMIZATION. Basics of
Characterizations of linear suboptimality for mathematical programs with equilibrium constraints
Based on robust generalized differential calculus, new results are derived giving pointwise necessary and sufficient conditions for linear suboptimality in general MPECs and its important specifications involving variational and quasivariational inequalities, implicit complementarity problems, etc.
Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization
• Mathematics, Computer Science
Neural Computation
• 2012
This letter proposes a simple way to significantly accelerate two well-known algorithms designed to solve NMF problems: the multiplicative updates of Lee and Seung and the hierarchical alternating least squares of Cichocki et al.
On the Convergence Rate of Dual Ascent Methods for Linearly Constrained Convex Minimization
• Mathematics, Computer Science
Math. Oper. Res.
• 1993
This study analyzes the rate of convergence of certain dual ascent methods for the problem of minimizing a strictly convex essentially smooth function subject to linear constraints and shows that, under mild assumptions on the problem, these methods attain a linear rate of convergence.
Sufficient and necessary conditions for perpetual multi-assets exchange options
• Mathematics
• 2011
This paper considers the general problem of optimal timing of the exchange of the sum of n Ito-diffusions for the sum of m others (e.g., the optimal time to exchange a geometric Brownian motion for a