Corpus ID: 8438120

Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks

@inproceedings{Scaman2017OptimalAF,
  title={Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks},
  author={Kevin Scaman and Francis R. Bach and S{\'e}bastien Bubeck and Yin Tat Lee and Laurent Massouli{\'e}},
  booktitle={ICML},
  year={2017}
}
In this paper, we determine the optimal convergence rates for strongly convex and smooth distributed optimization in two settings: centralized and decentralized communications over a network. For centralized (i.e. master/slave) algorithms, we show that distributing Nesterov's accelerated gradient descent is optimal and achieves a precision ε > 0 in time O(√κ_g(1 + Δτ) ln(1/ε)), where κ_g is the condition number of the (global) function to optimize, Δ is the diameter of the network, and τ (resp. 1…
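To make the centralized result concrete: each worker evaluates the gradient of its own local function, the master gathers and averages these gradients, and then applies a standard Nesterov accelerated update. Below is a minimal single-process sketch of this master/slave scheme, assuming illustrative quadratic local functions and hand-picked dimensions; all names and parameters are assumptions for illustration, not code from the paper.

import numpy as np

# Sketch of master/slave distributed Nesterov AGD (illustrative assumption:
# worker i holds a quadratic f_i(x) = 0.5 * x^T A_i x - b_i^T x).
rng = np.random.default_rng(0)
n_workers, d = 4, 10
A, b = [], []
for _ in range(n_workers):
    M = rng.standard_normal((d, d))
    A.append(M @ M.T + 0.1 * np.eye(d))   # positive definite, so f_i is smooth and strongly convex
    b.append(rng.standard_normal(d))

def local_grad(i, x):
    # Evaluated "at worker i"; the master only ever sees the returned vector.
    return A[i] @ x - b[i]

# Smoothness L_g and strong convexity mu_g of the global f = (1/n) * sum_i f_i.
H = sum(A) / n_workers
eigs = np.linalg.eigvalsh(H)
mu_g, L_g = eigs[0], eigs[-1]
kappa_g = L_g / mu_g                       # the condition number kappa_g from the abstract

x = y = np.zeros(d)
beta = (np.sqrt(kappa_g) - 1) / (np.sqrt(kappa_g) + 1)
for t in range(300):
    # One round: workers send local gradients to the master, which averages them.
    g = np.mean([local_grad(i, y) for i in range(n_workers)], axis=0)
    x_next = y - g / L_g                   # gradient step with step size 1/L_g
    y = x_next + beta * (x_next - x)       # Nesterov momentum
    x = x_next

Each iteration needs one local gradient per worker plus one exchange across the network, which is where the (1 + Δτ) factor in the stated time bound comes from: gradients must travel a distance of up to the diameter Δ at communication cost τ per link, on top of a unit of local computation.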

Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks
TLDR
This work designs two optimal algorithms, one of which is a variant of the recently proposed algorithm ADOM enhanced via a multi-consensus subroutine and a novel algorithm, called ADOM+, which is optimal in the case when access to the primal gradients is assumed.
Optimal Convergence Rates for Convex Distributed Optimization in Networks
TLDR
This work proposes a theoretical analysis of distributed optimization of convex functions using a network of computing units and provides a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function for non-smooth functions.
Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
TLDR
The error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions, and the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate are provided.
Optimal Algorithms for Distributed Optimization
TLDR
The results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix.
A dual approach for optimal algorithms in distributed optimization over networks
TLDR
This work studies dual-based algorithms for distributed convex optimization problems over networks, and proposes distributed algorithms that achieve the same optimal rates as their centralized counterparts (up to constant and logarithmic factors), with an additional optimal cost related to the spectral properties of the network.
Distributed Algorithms for Composite Optimization: Unified and Tight Convergence Analysis
TLDR
A by-product of this analysis is a tuning recommendation for several existing (non-accelerated) distributed algorithms that yields the fastest provable (worst-case) convergence rate.
Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization
TLDR
This work proposes two new algorithms for the task of decentralized minimization of the sum of smooth, strongly convex functions stored across the nodes of a network, and presents them as accelerated variants of the Forward-Backward algorithm for solving monotone inclusions associated with the decentralized optimization problem.
Decentralized Optimization with Heterogeneous Delays: a Continuous-Time Approach
TLDR
This paper proposes a novel continuous-time framework for analyzing asynchronous algorithms that does not require defining a global ordering of the events and allows the time complexity to be finely characterized in the presence of (heterogeneous) delays.
...
...

References

Showing 1-10 of 29 references
Fast Distributed Gradient Methods
TLDR
This work proposes two fast distributed gradient algorithms based on the centralized Nesterov gradient algorithm and establishes their convergence rates in terms of the per-node communications K and the per-node gradient evaluations k.
Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling
TLDR
This work develops and analyzes distributed algorithms based on dual subgradient averaging, provides sharp bounds on their convergence rates as a function of the network size and topology, and shows that the number of iterations required by the algorithm scales inversely in the spectral gap of the network.
Achieving Geometric Convergence for Distributed Optimization Over Time-Varying Graphs
TLDR
This paper introduces a distributed algorithm, referred to as DIGing, based on a combination of a distributed inexact gradient method and a gradient tracking technique that converges to a global and consensual minimizer over time-varying graphs.
Fast linear iterations for distributed averaging
TLDR
This work considers the problem of finding a linear iteration that yields distributed averaging consensus over a network, i.e., that asymptotically computes the average of some initial values given at the nodes, and gives several extensions and variations on the basic problem.
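The fixed linear iteration referred to here is x_{t+1} = W x_t for a symmetric, doubly stochastic weight matrix W supported on the network's edges; its convergence to the average is geometric, with rate given by the second-largest eigenvalue modulus of W. A minimal sketch on a ring graph, with uniform neighbor weights chosen purely for illustration (not the optimized weights studied in the paper):

import numpy as np

# Distributed averaging via x_{t+1} = W x_t on a ring of n nodes.
# W is symmetric and doubly stochastic, so x_t converges to the average of x_0
# at a geometric rate governed by |lambda_2(W)|.
n = 8
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0   # weight on each ring neighbor (illustrative)
    W[i, i] = 1.0 / 3.0                                  # self-weight; rows (and columns) sum to 1

x = np.random.default_rng(1).standard_normal(n)          # initial values held at the nodes
avg = x.mean()
for _ in range(200):
    x = W @ x                                            # one synchronous round of neighbor exchanges
print(np.max(np.abs(x - avg)))                           # consensus error, shrinks like |lambda_2(W)|^t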
Randomized gossip algorithms
TLDR
This work analyzes the averaging problem under the gossip constraint for an arbitrary network graph, and finds that the averaging time of a gossip algorithm depends on the second largest eigenvalue of a doubly stochastic matrix characterizing the algorithm.
Linear Convergence Rate of a Class of Distributed Augmented Lagrangian Algorithms
We study distributed optimization where nodes cooperatively minimize the sum of their individual, locally known, convex costs f_i(x); x ∈ ℝ^d is global. Distributed augmented Lagrangian (AL) methods…
On the Linear Convergence of the ADMM in Decentralized Consensus Optimization
TLDR
This paper establishes its linear convergence rate for the decentralized consensus optimization problem with strongly convex local objective functions, in terms of the network topology, the properties of the local objective functions, and the algorithm parameter.
Distributed Newton Method for Large-Scale Consensus Optimization
TLDR
This paper proposes a distributed Newton method for decentralized optimization of large sums of convex functions by utilizing a decomposition technique known as Global Consensus, which distributes the computation across the nodes of a graph and enforces a consensus constraint among the separated variables.
DSA: Decentralized Double Stochastic Averaging Gradient Algorithm
TLDR
The decentralized double stochastic averaging gradient (DSA) algorithm is proposed as a solution alternative that relies on strong convexity of local functions and Lipschitz continuity of local gradients to guarantee linear convergence of the sequence generated by DSA in expectation.
A Decentralized Second-Order Method with Exact Linear Convergence Rate for Consensus Optimization
TLDR
The exact second-order method (ESOM) is introduced here as an alternative that relies on a truncated Taylor's series to estimate the solution of the first-order condition imposed on the minimization of the quadratic approximation of the augmented Lagrangian.
...
...