# Distributed Heavy-Ball: A Generalization and Acceleration of First-Order Methods With Gradient Tracking

@article{Xin2020DistributedHA,
title={Distributed Heavy-Ball: A Generalization and Acceleration of First-Order Methods With Gradient Tracking},
author={Ran Xin and Usman A. Khan},
journal={IEEE Transactions on Automatic Control},
year={2020},
volume={65},
pages={2627-2633}
}
• Published 2020
• Mathematics, Computer Science
• IEEE Transactions on Automatic Control
We study distributed optimization to minimize a sum of smooth and strongly-convex functions. Recent work on this problem uses gradient tracking to achieve linear convergence to the exact global minimizer. However, a connection among different approaches has been unclear. In this paper, we first show that many of the existing first-order algorithms are related with a simple state transformation, at the heart of which lies a recently introduced algorithm known as …
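The abstract's central idea, gradient tracking combined with a heavy-ball momentum term, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes an undirected ring with doubly stochastic weights (a simplification), scalar quadratic local costs `f_i(x) = 0.5 * a[i] * (x - b[i])**2`, and hand-picked values for the step size `alpha` and momentum `beta`. All names here are hypothetical and chosen for illustration only.

```python
def gradient_tracking_heavy_ball(a, b, weights, alpha=0.05, beta=0.2, iters=400):
    """Illustrative gradient tracking with a heavy-ball term (assumed setup)."""
    n = len(a)

    def grad(i, x):
        # Gradient of the assumed local cost f_i(x) = 0.5 * a[i] * (x - b[i])**2.
        return a[i] * (x - b[i])

    def mix(v):
        # One round of neighbor averaging with the given weight matrix.
        return [sum(weights[i][j] * v[j] for j in range(n)) for i in range(n)]

    x_prev = [0.0] * n
    x = [0.0] * n
    # The tracker y starts at the local gradients and follows their average.
    y = [grad(i, x[i]) for i in range(n)]

    for _ in range(iters):
        wx, wy = mix(x), mix(y)
        # Consensus step, descent along the tracked gradient, plus momentum.
        x_next = [wx[i] - alpha * y[i] + beta * (x[i] - x_prev[i]) for i in range(n)]
        # Gradient-tracking update: mix, then add the change in local gradient.
        y = [wy[i] + grad(i, x_next[i]) - grad(i, x[i]) for i in range(n)]
        x_prev, x = x, x_next
    return x


# Assumed example: five agents on a ring, each averaging with its two neighbors.
n = 5
ring = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in (i, (i + 1) % n, (i - 1) % n):
        ring[i][j] = 1.0 / 3.0

a = [1.0, 1.2, 1.5, 1.8, 2.0]
b = [-2.0, -1.0, 0.0, 1.0, 3.0]

# Minimizer of the sum of the quadratics, available in closed form.
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
x_final = gradient_tracking_heavy_ball(a, b, ring)
max_error = max(abs(xi - x_star) for xi in x_final)
```

In this toy setting every agent's estimate in `x_final` should agree with `x_star` up to a small numerical error, which is the "exact global minimizer" behavior the abstract attributes to gradient tracking. Directed graphs, which the paper addresses, would need row- and column-stochastic weights instead of the doubly stochastic ring assumed here.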
Decentralized Optimization Over Time-Varying Directed Graphs With Row and Column-Stochastic Matrices
• Mathematics, Computer Science
• IEEE Transactions on Automatic Control
• 2020
A distributed optimization algorithm is proposed that minimizes a sum of convex functions over time-varying, random directed graphs. It relies on a novel information-mixing approach that exploits both row- and column-stochastic weights to achieve agreement on the optimal solution when the underlying graph is directed.
Asymptotic Properties of $\mathcal{S}$-$\mathcal{AB}$ Method with Diminishing Stepsize
• Shengchao Zhao, Yongchao Liu
• Mathematics
• 2021
The popular AB/push-pull method for distributed optimization problems may unify many of the existing decentralized first-order methods based on the gradient tracking technique. More recently, the…
Distributed Nesterov Gradient Methods Over Arbitrary Graphs
• Computer Science, Mathematics
• IEEE Signal Processing Letters
• 2019
A distributed Nesterov gradient method is introduced that requires only row-stochastic weights, at the expense of additional iterations for eigenvector estimation, and achieves acceleration compared to the current state-of-the-art methods for distributed optimization.
Distributed Adaptive Newton Methods with Globally Superlinear Convergence
• Computer Science, Mathematics
• ArXiv
• 2020
DAN and DAN-LA can globally achieve quadratic and superlinear convergence rates, respectively, and are shown to have advantages over existing methods.
Robust Distributed Accelerated Stochastic Gradient Methods for Multi-Agent Networks
• Mathematics, Computer Science
• ArXiv
• 2019
A framework is developed for choosing the stepsize and momentum parameters of these algorithms to optimize performance by systematically trading off bias, variance, robustness to gradient noise, and dependence on network effects.
Distributed stochastic optimization with gradient tracking over strongly-connected networks
• Computer Science, Mathematics
• 2019 IEEE 58th Conference on Decision and Control (CDC)
• 2019
It is shown that under a sufficiently small constant step-size, $\mathcal{S}$-$\mathcal{AB}$ converges linearly (in the expected mean-square sense) to a neighborhood of the global minimizer.
Primal–Dual Methods for Large-Scale and Distributed Convex Optimization and Data Analytics
• Computer Science, Mathematics
• Proceedings of the IEEE
• 2020
This work provides a tutorial-style introduction to ALM and its variants for solving convex optimization problems in large-scale and distributed settings, describes control-theoretic tools for the algorithms’ analysis and design, and provides novel insights into the context of two emerging applications: federated learning and distributed energy trading.
Convergence Rate of Distributed Optimization Algorithms Based on Gradient Tracking
• Mathematics, Computer Science
• ArXiv
• 2019
This paper studies the convergence rate of SONATA, the first work proving a convergence rate (in particular, linear rate) for distributed algorithms applicable to such a general class of composite, constrained optimization problems over graphs.
Double-Like Accelerated Distributed optimization Algorithm for Convex optimization Problem
• Computer Science
• 2020 10th International Conference on Information Science and Technology (ICIST)
• 2020
It is proved that DA-DOA converges linearly to the optimal solution of the problem when the step size and momentum coefficient are positive and sufficiently small, and an explicit linear convergence rate is given.
Distributed Nonlinear Estimation Over Unbalanced Directed Networks
• Mathematics, Computer Science
• IEEE Transactions on Signal Processing
• 2020
This paper investigates distributed nonlinear parameter estimation in unbalanced multi-agent networks, where individual agents sequentially make local, nonlinear, noisy measurements of the true but…

#### References

• Mathematics, Physics
• IEEE Transactions on Automatic Control
• 2020
This paper considers the distributed optimization problem over a network, where the objective is to optimize a global function formed by a sum of local functions, using only local computation and…
• Mathematics, Computer Science
• IEEE Transactions on Automatic Control
• 2018
The proposed algorithm, Accelerated Distributed Directed OPTimization (ADD-OPT), achieves the best known convergence rate for this class of problems, given strongly convex objective functions with globally Lipschitz-continuous gradients.
Performance of first-order methods for smooth convex minimization: a novel approach
• Computer Science, Mathematics
• Math. Program.
• 2014
A novel approach for analyzing the worst-case performance of first-order black-box optimization methods, which focuses on smooth unconstrained convex minimization over the Euclidean space and derives a new and tight analytical bound on its performance.
DEXTRA: A Fast Algorithm for Optimization Over Directed Graphs
• Mathematics, Computer Science
• IEEE Transactions on Automatic Control
• 2017
A fast distributed algorithm, termed DEXTRA, is proposed to solve the optimization problem in which agents reach agreement and collaboratively minimize the sum of their local objective functions over the network, where the communication between the agents is described by a directed graph.
Distributed Nesterov Gradient Methods Over Arbitrary Graphs
• Computer Science, Mathematics
• IEEE Signal Processing Letters
• 2019
A distributed Nesterov gradient method is introduced that requires only row-stochastic weights, at the expense of additional iterations for eigenvector estimation, and achieves acceleration compared to the current state-of-the-art methods for distributed optimization.
Distributed stochastic optimization with gradient tracking over strongly-connected networks
• Computer Science, Mathematics
• 2019 IEEE 58th Conference on Decision and Control (CDC)
• 2019
It is shown that under a sufficiently small constant step-size, $\mathcal{S}$-$\mathcal{AB}$ converges linearly (in the expected mean-square sense) to a neighborhood of the global minimizer.
Achieving Geometric Convergence for Distributed Optimization Over Time-Varying Graphs
• Mathematics, Computer Science
• SIAM J. Optim.
• 2017
This paper introduces a distributed algorithm, referred to as DIGing, based on a combination of a distributed inexact gradient method and a gradient tracking technique that converges to a global and consensual minimizer over time-varying graphs.
Linear Convergence in Optimization Over Directed Graphs With Row-Stochastic Matrices
• Computer Science, Mathematics
• IEEE Transactions on Automatic Control
• 2018
This paper considers a distributed optimization problem over a multiagent network, in which the objective function is a sum of individual cost functions at the agents, and proposes an algorithm that achieves the best known rate of convergence for this class of problems.
Harnessing smoothness to accelerate distributed optimization
• Mathematics, Computer Science
• 2016 IEEE 55th Conference on Decision and Control (CDC)
• 2016
This paper proposes a distributed algorithm that, despite using the same amount of communication per iteration as DGD, can effectively harness the function smoothness and converge to the optimum at a linear rate, $O(\gamma^t)$ for some $\gamma \in (0,1)$, if the objective function is strongly convex and smooth.