# Towards Accelerated Rates for Distributed Optimization over Time-Varying Networks

@inproceedings{Rogozin2021TowardsAR,
title={Towards Accelerated Rates for Distributed Optimization over Time-Varying Networks},
author={Alexander Rogozin and Vladislav Lukoshkin and Alexander V. Gasnikov and D. Kovalev and Egor Shulgin},
booktitle={OPTIMA},
year={2021}
}
• Published in OPTIMA 23 September 2020
• Computer Science
We study the problem of decentralized optimization over time-varying networks with strongly convex smooth cost functions. In our approach, nodes run a multi-step gossip procedure after making each gradient update, thus ensuring approximate consensus at each iteration, while the outer loop is based on the accelerated Nesterov scheme. The algorithm achieves precision $\varepsilon > 0$ in $O(\sqrt{\kappa_g}\chi\log^2(1/\varepsilon))$ communication steps and $O(\sqrt{\kappa_g}\log(1/\varepsilon))$ gradient computations.

## Citations

An Accelerated Method For Decentralized Distributed Stochastic Optimization Over Time-Varying Graphs
• Computer Science, Mathematics
2021 60th IEEE Conference on Decision and Control (CDC)
• 2021
This work proposes the first accelerated (in the sense of Nesterov's acceleration) method that simultaneously attains communication and oracle complexity bounds that are optimal up to a logarithmic factor for smooth strongly convex distributed stochastic optimization.

Accelerated Gradient Tracking over Time-varying Graphs for Decentralized Optimization
• Computer Science
ArXiv
• 2021
The widely used accelerated gradient tracking is revisited and extended to time-varying graphs, and the dependence on the network connectivity constants can be further improved to $O(1)$ and $O(\frac{\gamma}{1-\sigma_\gamma})$ for the computation and communication complexities, respectively.

Newton Method over Networks is Fast up to the Statistical Precision
• Computer Science
ICML
• 2021
This work proposes a distributed cubic regularization of the Newton method for solving (constrained) empirical risk minimization problems over a network of agents, modeled as an undirected graph, and derives global complexity bounds for convex and strongly convex losses.
ADOM: Accelerated Decentralized Optimization Method for Time-Varying Networks
• Computer Science
ICML
• 2021
ADOM uses a dual oracle, i.e., it assumes access to the gradient of the Fenchel conjugate of the individual loss functions, and its communication complexity is the same as that of the accelerated Nesterov gradient method (Nesterov, 2003).

Recent theoretical advances in decentralized distributed convex optimization
• Computer Science
• 2020
This paper focuses on how the results of decentralized distributed convex optimization can be explained based on optimal algorithms for the non-distributed setup, and provides recent results that have not been published yet.

Parallel and Distributed algorithms for ML problems
• Computer Science
• 2020
A survey of modern parallel and distributed approaches to solving sum-type convex minimization problems arising in ML applications is made.

Inexact Tensor Methods and Their Application to Stochastic Convex Optimization
• Computer Science
• 2020
A general non-accelerated tensor method under inexact information on higher-order derivatives is proposed, its convergence rate is analyzed, and sufficient conditions are provided for this method to have complexity similar to that of the exact tensor method.

Strongly Convex Decentralized Optimization Over Time-Varying Networks
• Computer Science
• 2021
This work designs two optimal algorithms, one of which is a variant of the recently proposed algorithm ADOM enhanced via a multi-consensus subroutine, and a novel algorithm, called ADOM+, which is optimal in the case when access to the primal gradients is assumed.
Decentralized Saddle-Point Problems with Different Constants of Strong Convexity and Strong Concavity
• Computer Science, Mathematics
• 2022
This paper studies distributed saddle-point problems (SPP) with strongly-convex-strongly-concave smooth objectives that have different strong convexity and strong concavity parameters in the composite terms, which correspond to the min and max variables, and in the bilinear saddle-point part.

Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks
• Computer Science
NeurIPS
• 2021
This work designs two optimal algorithms, one of which is a variant of the recently proposed algorithm ADOM enhanced via a multi-consensus subroutine, and a novel algorithm, called ADOM+, which is optimal in the case when access to the primal gradients is assumed.

## References

Showing 1–10 of 37 references.

Optimal Accelerated Variance Reduced EXTRA and DIGing for Strongly Convex and Smooth Decentralized Optimization
• Computer Science
ArXiv
• 2020
The famous EXTRA and DIGing methods are extended with accelerated variance reduction (VR), and two methods that reach precision $\epsilon$ with optimal numbers of stochastic gradient evaluations and communication rounds are proposed.
A Sharp Convergence Rate Analysis for Distributed Accelerated Gradient Methods
• Computer Science
• 2018
Two algorithms based on the framework of the accelerated penalty method with increasing penalty parameters are presented, which achieve near-optimal complexities for both computation and communication.
Variance Reduced EXTRA and DIGing and Their Optimal Acceleration for Strongly Convex Decentralized Optimization
• Computer Science
• 2020
The widely used EXTRA and DIGing methods with variance reduction (VR) are extended, and the accelerated VR-EXTRA and VR-DIGing with both the optimal stochastic gradient computation complexity and communication complexity are proposed.
An Optimal Algorithm for Decentralized Finite Sum Optimization
• Computer Science
SIAM Journal on Optimization
• 2021
ADFS, which uses local stochastic proximal updates and decentralized communications between nodes, is derived, and a complexity lower bound is given to show that ADFS is optimal among decentralized algorithms.
Revisiting EXTRA for Smooth Distributed Optimization
• Computer Science, Mathematics
SIAM J. Optim.
• 2020
A sharp complexity analysis for EXTRA within the improved Catalyst framework is given; when strong convexity is absent, the computation and communication complexities of the accelerated EXTRA are only worse by logarithmic factors.
Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks
• Computer Science
ICML
• 2017
The efficiency of MSDA is verified against state-of-the-art methods on two problems: least-squares regression and classification by logistic regression.
Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
• Mathematics, Computer Science
NeurIPS
• 2018
The first optimal first-order decentralized algorithm, called multi-step primal-dual (MSPD), and its corresponding optimal convergence rate are provided; the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions.
Achieving Geometric Convergence for Distributed Optimization Over Time-Varying Graphs
• Mathematics, Computer Science
SIAM J. Optim.
• 2017
This paper introduces a distributed algorithm, referred to as DIGing, based on a combination of a distributed inexact gradient method and a gradient tracking technique that converges to a global and consensual minimizer over time-varying graphs.
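The gradient-tracking idea behind DIGing can be sketched in a few lines: each node keeps an iterate and a tracker of the network-average gradient, and both are mixed with neighbors at every step. The sketch below is a minimal illustration with a fixed gossip matrix `W` and assumed quadratic losses, not the authors' time-varying implementation; the function name, step size, and problem data are all chosen for the example.

```python
import numpy as np

def diging(grads, X0, W, alpha, n_iters):
    """Minimal DIGing sketch: each node i keeps an iterate x_i and a
    tracker y_i that estimates the average gradient across the network."""
    X = X0.copy()
    G = np.stack([g(x) for g, x in zip(grads, X)])  # local gradients at x_i
    Y = G.copy()                                    # trackers start at the local gradients
    for _ in range(n_iters):
        X_new = W @ X - alpha * Y                   # mix with neighbors, step along the tracker
        G_new = np.stack([g(x) for g, x in zip(grads, X_new)])
        Y = W @ Y + G_new - G                       # tracking step: sum_i y_i stays equal to sum_i grad_i
        X, G = X_new, G_new
    return X

# Example: three nodes minimize f_i(x) = 0.5 * c_i * (x - a_i)^2;
# the global minimizer is sum(c_i * a_i) / sum(c_i) = 4/3.
c, a = [1.0, 2.0, 3.0], [0.0, 1.0, 2.0]
grads = [lambda x, ci=ci, ai=ai: ci * (x - ai) for ci, ai in zip(c, a)]
W = np.full((3, 3), 0.25) + 0.25 * np.eye(3)  # doubly stochastic mixing matrix
X = diging(grads, np.zeros((3, 1)), W, alpha=0.05, n_iters=2000)
```

The tracking update preserves the invariant that the trackers sum to the sum of the local gradients, which is what lets the iterates converge to a consensual global minimizer rather than to each node's local one.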
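The paper's own scheme, as summarized in the abstract, alternates a Nesterov-style gradient step with a multi-step gossip subroutine that enforces approximate consensus. The sketch below illustrates that structure under assumed quadratic losses and a fixed gossip matrix `W`; the helper names and parameter choices are hypothetical, not the authors' code.

```python
import numpy as np

def multi_step_gossip(X, W, T):
    """Run T gossip rounds so that the rows of X become approximately equal."""
    for _ in range(T):
        X = W @ X  # one communication round: each node mixes with its neighbors
    return X

def accelerated_gossip_gd(grads, X0, W, L, mu, T, n_iters):
    """Sketch of a Nesterov outer loop with approximate consensus after
    every gradient update (an illustration, not the paper's exact method)."""
    beta = (np.sqrt(L / mu) - 1) / (np.sqrt(L / mu) + 1)  # momentum for kappa = L / mu
    X, Y = X0.copy(), X0.copy()
    for _ in range(n_iters):
        G = np.stack([g(y) for g, y in zip(grads, Y)])  # local gradients at y_i
        X_new = multi_step_gossip(Y - G / L, W, T)      # gradient step, then gossip to consensus
        Y = X_new + beta * (X_new - X)                  # Nesterov extrapolation
        X = X_new
    return X.mean(axis=0)

# Example: three nodes minimize f_i(x) = 0.5 * c_i * (x - a_i)^2;
# the global minimizer is sum(c_i * a_i) / sum(c_i) = 4/3.
c, a = [1.0, 2.0, 3.0], [0.0, 1.0, 2.0]
grads = [lambda x, ci=ci, ai=ai: ci * (x - ai) for ci, ai in zip(c, a)]
W = np.full((3, 3), 0.25) + 0.25 * np.eye(3)  # doubly stochastic mixing matrix
x_star = accelerated_gossip_gd(grads, np.zeros((3, 1)), W, L=3.0, mu=1.0, T=20, n_iters=300)
```

With enough gossip rounds per iteration the local iterates are nearly identical, so the outer loop behaves like centralized Nesterov acceleration on the average objective; this is the mechanism behind the $\log^2(1/\varepsilon)$ factor in the communication complexity (one $\log$ from the outer loop, one from the gossip subroutine).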