# Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity

@article{Kovalev2022OptimalGS, title={Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity}, author={Dmitry Kovalev and Aleksandr Beznosikov and Ekaterina Borodich and Alexander V. Gasnikov and Gesualdo Scutari}, journal={ArXiv}, year={2022}, volume={abs/2205.15136} }

We study structured convex optimization problems with additive objective r := p + q, where r is (μ-strongly) convex, q is Lq-smooth and convex, and p is Lp-smooth, possibly nonconvex. For this class of problems, we propose an inexact accelerated gradient sliding method that can skip the gradient computation for one of these components while still achieving the optimal complexity of gradient calls to p and q, that is, O(√(Lp/μ)) and O(√(Lq/μ)), respectively. This result is much sharper than the…
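The setting and complexity claims above can be restated compactly as follows (a paraphrase of the abstract only; no constants or additional logarithmic factors beyond what the abstract states are assumed):

```latex
% Setting: minimize an additive composite objective
\min_{x \in \mathbb{R}^d} \; r(x) := p(x) + q(x),
\quad \text{where } r \text{ is } \mu\text{-strongly convex},
\; q \text{ is convex and } L_q\text{-smooth},
\; p \text{ is } L_p\text{-smooth (possibly nonconvex)}.

% Claimed optimal oracle complexities (number of gradient calls
% to each component, achieved by the inexact accelerated
% gradient sliding method):
\#\nabla p \;=\; O\!\left(\sqrt{L_p/\mu}\right),
\qquad
\#\nabla q \;=\; O\!\left(\sqrt{L_q/\mu}\right).
```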

## 4 Citations

### Faster federated optimization under second-order similarity

- Computer Science
- 2022

A new analysis of the Stochastic Proximal Point Method (SPPM) is provided, which is simple, allows for approximate proximal point evaluations, does not require any smoothness assumptions, and shows a clear improvement in communication complexity over ordinary distributed stochastic gradient descent.

### Compression and Data Similarity: Combination of Two Techniques for Communication-Efficient Solving of Distributed Variational Inequalities

- Computer Science, ArXiv
- 2022

This paper considers a combination of two popular approaches: compression and data similarity, and shows that this synergy can be more effective than each of the approaches separately in solving distributed smooth strongly monotone variational inequalities.

### Smooth Monotone Stochastic Variational Inequalities and Saddle Point Problems - Survey

- Mathematics, ArXiv
- 2022

This paper is a survey of methods for solving smooth (strongly) monotone stochastic variational inequalities. To begin with, we give the deterministic foundation from which the stochastic methods…

### Exploiting higher-order derivatives in convex optimization methods

- Computer Science
- 2022

The winners are Dmitry Kamzolov (Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE), Alexander Gasnikov (MIPT, Moscow, Russia; IITP RAS, Moscow), Pavel Dvurechensky (WIAS, Berlin, Germany).

## References

Showing 1-10 of 49 references

### Optimal Decentralized Distributed Algorithms for Stochastic Convex Optimization.

- Computer Science, Mathematics
- 2019

This work considers stochastic convex optimization problems with affine constraints and develops several methods, using either a primal or a dual approach, to solve them, and provides convergence analysis for these methods under unbiased and biased oracles, respectively.

### Acceleration in Distributed Optimization Under Similarity

- Computer Science, AISTATS
- 2022

Numerical results show significant communication savings with respect to existing accelerated distributed schemes, especially when solving ill-conditioned problems.

### Accelerated gradient sliding for structured convex optimization

- Computer Science, Mathematics, Comput. Optim. Appl.
- 2022

This paper presents an accelerated gradient sliding (AGS) method for minimizing the summation of two smooth convex functions with different Lipschitz constants, and shows that the AGS method can skip the gradient computation for one of these smooth components without slowing down the overall optimal rate of convergence.

### On Convergence of Distributed Approximate Newton Methods: Globalization, Sharper Bounds and Beyond

- Computer Science, Mathematics, J. Mach. Learn. Res.
- 2020

A heavy-ball method is proposed to accelerate the convergence of DANE, showing that nearly tight local rate of convergence can be established for strongly convex functions, and with proper modification of algorithm the same result applies globally to linear prediction models.

### Distributed Optimization Based on Gradient Tracking Revisited: Enhancing Convergence Rate via Surrogation

- Computer Science, SIAM Journal on Optimization
- 2022

This work builds on the SONATA algorithm and achieves the first linear rate result for distributed composite optimization; it improves on existing schemes just minimizing $F$, whose rate depends on much larger quantities than $\kappa_g$ (e.g., the worst-case condition number among the agents).

### Conditional Gradient Sliding for Convex Optimization

- Computer Science, SIAM J. Optim.
- 2016

The conditional gradient sliding (CGS) algorithm developed herein can skip the computation of gradients from time to time and, as a result, can achieve the optimal complexity bounds in terms of not only the number of calls to the $LO$ oracle but also the number of gradient evaluations.

### Primal–dual accelerated gradient methods with small-dimensional relaxation oracle

- Computer Science, Optimization Methods and Software
- 2020

It is demonstrated how in practice one can efficiently use the combination of line-search and primal-duality by considering a convex optimization problem with a simple structure (for example, linearly constrained).

### Mirror-prox sliding methods for solving a class of monotone variational inequalities

- Mathematics, Computer Science
- 2021

By identifying the gradient components existing in the operator of VI, it is shown that it is possible to skip computations of the gradients from time to time, while still maintaining the optimal iteration complexity for solving these VI problems.

### Gradient methods for minimizing composite functions

- Mathematics, Computer Science, Math. Program.
- 2013

In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two terms: one is smooth and given by a black-box oracle, and another is…

### On Accelerated Methods for Saddle-Point Problems with Composite Structure

- Computer Science, Mathematics
- 2021

This work considers strongly-convex-strongly-concave saddle-point problems with general non-bilinear objective and different condition numbers with respect to the primal and the dual variables and proposes a variance reduction algorithm with complexity estimates superior to the existing bounds in the literature.