Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity

Dmitry Kovalev, Aleksandr Beznosikov, Ekaterina Borodich, Alexander V. Gasnikov, Gesualdo Scutari
We study structured convex optimization problems with an additive objective $r := p + q$, where $r$ is ($\mu$-strongly) convex, $q$ is $L_q$-smooth and convex, and $p$ is $L_p$-smooth, possibly nonconvex. For this class of problems, we propose an inexact accelerated gradient sliding method that can skip the gradient computation for one of these components while still achieving the optimal complexity of gradient calls of $p$ and $q$, that is, $O(\sqrt{L_p/\mu})$ and $O(\sqrt{L_q/\mu})$, respectively. This result is much sharper than the…
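The idea of skipping gradient computations for one component can be illustrated with a minimal sketch. This is a plain, non-accelerated variant and not the method proposed in the paper: each outer step calls the gradient of p once, and an inner loop then takes several steps that call only the gradient of q. The function name `sliding_sketch`, the quadratic test functions, and the iteration counts are all assumptions made for illustration, and p is taken to be convex here for simplicity.

```python
def sliding_sketch(grad_p, grad_q, x0, L_p, L_q, outer_iters=100, inner_iters=20):
    """Illustrative (non-accelerated) sliding loop.

    Outer step: evaluate grad_p once at the current point x.
    Inner loop: approximately minimize the surrogate
        <grad_p(x), y> + (L_p / 2) * ||y - x||^2 + q(y)
    by gradient descent, which needs only grad_q.
    """
    x = list(x0)
    calls = {"p": 0, "q": 0}
    step = 1.0 / (L_p + L_q)  # the surrogate is (L_p + L_q)-smooth
    for _ in range(outer_iters):
        g_p = grad_p(x)
        calls["p"] += 1
        y = list(x)
        for _ in range(inner_iters):
            g_q = grad_q(y)
            calls["q"] += 1
            y = [yi - step * (gpi + L_p * (yi - xi) + gqi)
                 for yi, xi, gpi, gqi in zip(y, x, g_p, g_q)]
        x = y
    return x, calls


# Hypothetical test problem (not from the paper):
# p(x) = 0.5 * ((x0 - 1)^2 + 0.5 * (x1 - 1)^2), so L_p = 1
# q(x) = 0.5 * (100 * x0^2 + 0.1 * x1^2),       so L_q = 100
def grad_p(x):
    return [x[0] - 1.0, 0.5 * (x[1] - 1.0)]


def grad_q(x):
    return [100.0 * x[0], 0.1 * x[1]]


x_final, calls = sliding_sketch(grad_p, grad_q, [0.0, 0.0], L_p=1.0, L_q=100.0)
print(x_final, calls)
```

For this test problem the minimizer of p + q is (1/101, 0.5/0.6), and the loop uses 100 gradient calls of p against 2000 of q. The sketch only shows how the two call counts decouple; it does not reproduce the accelerated rates stated above.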


Faster federated optimization under second-order similarity

A new analysis of the Stochastic Proximal Point Method (SPPM) is provided, which is simple, allows for approximate proximal point evaluations, does not require any smoothness assumptions, and shows a clear improvement in communication complexity over ordinary distributed stochastic gradient descent.
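As a rough illustration of the stochastic proximal point iteration, the sketch below samples one component per step and applies its proximal operator. It assumes scalar quadratic components f_i(y) = 0.5 * a_i * (y - b_i)^2, for which the proximal step has a closed form; the names `prox_quadratic` and `sppm`, the data, and the step size are hypothetical and not taken from the cited paper, which also covers approximate proximal evaluations.

```python
import random


def prox_quadratic(x, a, b, gamma):
    """Exact prox of f(y) = 0.5 * a * (y - b)**2 at x with step size gamma.

    Solves min_y f(y) + (1 / (2 * gamma)) * (y - x)**2 in closed form.
    """
    return (x + gamma * a * b) / (1.0 + gamma * a)


def sppm(components, x0, gamma, iters, seed=0):
    """Stochastic proximal point iteration: x <- prox_{gamma * f_i}(x), i sampled uniformly."""
    rng = random.Random(seed)
    x = x0
    for _ in range(iters):
        a, b = rng.choice(components)
        x = prox_quadratic(x, a, b, gamma)
    return x


# Hypothetical data: four components (a_i, b_i); the average objective is minimized at 0.5.
components = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
x_hat = sppm(components, x0=5.0, gamma=0.01, iters=2000)
print(x_hat)
```

With a constant step size the iterate is expected to settle in a neighborhood of the minimizer rather than converge to it exactly; a smaller `gamma` shrinks that neighborhood at the cost of slower progress.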

Compression and Data Similarity: Combination of Two Techniques for Communication-Efficient Solving of Distributed Variational Inequalities

This paper considers a combination of two popular approaches, compression and data similarity, and shows that this synergy can be more effective than each of the approaches separately in solving distributed smooth strongly monotone variational inequalities.

Smooth Monotone Stochastic Variational Inequalities and Saddle Point Problems - Survey

This paper is a survey of methods for solving smooth (strongly) monotone stochastic variational inequalities. To begin with, we give the deterministic foundation from which the stochastic methods

Exploiting higher-order derivatives in convex optimization methods

The winners are Dmitry Kamzolov (Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE), Alexander Gasnikov (MIPT, Moscow, Russia; IITP RAS, Moscow), Pavel Dvurechensky (WIAS, Berlin, Germany).



Optimal Decentralized Distributed Algorithms for Stochastic Convex Optimization.

This work considers stochastic convex optimization problems with affine constraints, develops several methods that solve them using either a primal or a dual approach, and provides convergence analysis for these methods under unbiased and biased oracles, respectively.

Acceleration in Distributed Optimization Under Similarity

Numerical results show significant communication savings with respect to existing accelerated distributed schemes, especially when solving ill-conditioned problems.

Accelerated gradient sliding for structured convex optimization

This paper presents an accelerated gradient sliding (AGS) method for minimizing the summation of two smooth convex functions with different Lipschitz constants, and shows that the AGS method can skip the gradient computation for one of these smooth components without slowing down the overall optimal rate of convergence.

On Convergence of Distributed Approximate Newton Methods: Globalization, Sharper Bounds and Beyond

A heavy-ball method is proposed to accelerate the convergence of DANE, showing that a nearly tight local rate of convergence can be established for strongly convex functions, and that with a proper modification of the algorithm the same result applies globally to linear prediction models.

Distributed Optimization Based on Gradient Tracking Revisited: Enhancing Convergence Rate via Surrogation

This work builds on the SONATA algorithm and achieves the first linear rate result for distributed composite optimization; it improves on existing schemes just minimizing $F$, whose rate depends on much larger quantities than $\kappa_g$ (e.g., the worst-case condition number among the agents).

Conditional Gradient Sliding for Convex Optimization

The conditional gradient sliding (CGS) algorithm developed herein can skip the computation of gradients from time to time and, as a result, can achieve the optimal complexity bounds in terms of not only the number of calls to the $LO$ oracle but also the number of gradient evaluations.

Primal–dual accelerated gradient methods with small-dimensional relaxation oracle

It is demonstrated how in practice one can efficiently use the combination of line-search and primal-duality by considering a convex optimization problem with a simple structure (for example, linearly constrained).

Mirror-prox sliding methods for solving a class of monotone variational inequalities

By identifying the gradient components existing in the operator of VI, it is shown that it is possible to skip computations of the gradients from time to time, while still maintaining the optimal iteration complexity for solving these VI problems.

Gradient methods for minimizing composite functions

  • Y. Nesterov
  • Mathematics, Computer Science
    Math. Program.
  • 2013
In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two terms: one is smooth and given by a black-box oracle, and another is

On Accelerated Methods for Saddle-Point Problems with Composite Structure

This work considers strongly-convex-strongly-concave saddle-point problems with general non-bilinear objective and different condition numbers with respect to the primal and the dual variables and proposes a variance reduction algorithm with complexity estimates superior to the existing bounds in the literature.