Corpus ID: 229924240

PMGT-VR: A decentralized proximal-gradient algorithmic framework with variance reduction

@article{Ye2020PMGTVRAD,
  title={PMGT-VR: A decentralized proximal-gradient algorithmic framework with variance reduction},
  author={Haishan Ye and Wei Xiong and Tong Zhang},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.15010}
}
This paper considers the decentralized composite optimization problem. We propose a novel decentralized variance-reduced proximal-gradient algorithmic framework, called PMGT-VR, which combines several techniques: multi-consensus, gradient tracking, and variance reduction. The proposed framework imitates centralized algorithms, and we demonstrate that algorithms under this framework achieve convergence rates similar to those of their centralized…
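The abstract names the framework's three ingredients (multi-consensus, gradient tracking, variance reduction) without showing how they fit together in an iteration. The sketch below is an illustrative rendering of such a combination, not the authors' exact algorithm: it assumes an SVRG-style variance-reduced estimator, plain gossip averaging in place of the paper's accelerated multi-consensus, an l1 regularizer handled by soft-thresholding, and synthetic least-squares data; all names and constants are hypothetical.

```python
# Illustrative sketch of a PMGT-VR-style iteration (not the authors' exact
# pseudocode): each node combines an SVRG-type variance-reduced gradient,
# gradient tracking, a few rounds of gossip averaging ("multi-consensus"),
# and a proximal step. Names and constants are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 8, 50, 10            # nodes, samples per node, dimension
eta, lam, K = 0.05, 0.01, 3    # step size, l1 weight, consensus rounds

# Local least-squares data: f_i(x) = (1/(2n)) * ||A_i x - b_i||^2
A = rng.normal(size=(m, n, d))
b = rng.normal(size=(m, n))

# Doubly stochastic mixing matrix for a ring graph.
W = np.zeros((m, m))
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = W[i, (i + 1) % m] = 0.25

def local_grad(i, x, idx=None):
    """Gradient of f_i at x, optionally over a sampled mini-batch idx."""
    Ai, bi = (A[i], b[i]) if idx is None else (A[i][idx], b[i][idx])
    return Ai.T @ (Ai @ x - bi) / len(bi)

def prox_l1(x, t):
    """Proximal operator of t * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def mix(Z, rounds=K):
    """Plain gossip averaging; the paper uses an accelerated variant."""
    for _ in range(rounds):
        Z = W @ Z
    return Z

# State: iterates X, SVRG snapshots Snap with their full gradients G_snap,
# and gradient trackers Y initialized at the variance-reduced estimates V.
X = np.zeros((m, d))
Snap = X.copy()
G_snap = np.stack([local_grad(i, Snap[i]) for i in range(m)])
V = G_snap.copy()
Y = V.copy()

for t in range(200):
    # SVRG-style variance-reduced local gradients on a shared mini-batch.
    idx = rng.integers(0, n, size=5)
    V_new = np.stack([
        local_grad(i, X[i], idx) - local_grad(i, Snap[i], idx) + G_snap[i]
        for i in range(m)
    ])
    # Gradient tracking update, followed by multi-consensus on the tracker.
    Y = mix(Y + V_new - V)
    V = V_new
    # Proximal step, followed by multi-consensus on the iterates.
    X = mix(prox_l1(X - eta * Y, eta * lam))
    # Periodically refresh the SVRG snapshot and its full local gradient.
    if (t + 1) % 50 == 0:
        Snap = X.copy()
        G_snap = np.stack([local_grad(i, Snap[i]) for i in range(m)])

print("consensus error:", np.linalg.norm(X - X.mean(axis=0)))
```

The multi-consensus step is what lets such a scheme mimic its centralized counterpart: averaging several times per iteration drives the local iterates close enough to their mean that the centralized convergence analysis approximately carries over.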

Citations

Graph topology invariant gradient and sampling complexity for decentralized and stochastic optimization
New algorithms are proposed whose gradient and sampling complexities are graph-topology invariant, i.e., independent of the network structure, while their communication complexities remain optimal.
Decentralized Stochastic Variance Reduced Extragradient Method
A novel decentralized optimization algorithm, called multi-consensus stochastic variance reduced extragradient, is proposed, which achieves the best known stochastic first-order oracle (SFO) complexity for this problem.
Decentralized Stochastic Proximal Gradient Descent with Variance Reduction over Time-varying Networks
This paper transforms the decentralized algorithm into a centralized inexact proximal-gradient algorithm with variance reduction and proves that DPSVRG converges at a rate of O(1/T) for general convex objectives plus a non-smooth term, where T is the number of iterations, while the convergence rate of DSPG is slowed by the variance of the stochastic gradients.

References

Showing 1-10 of 45 references
Variance-Reduced Decentralized Stochastic Optimization With Accelerated Convergence
A novel algorithmic framework is proposed to minimize a finite sum of functions available over a network of nodes; the framework is stochastic and decentralized, and is thus particularly suitable for problems where large-scale, potentially private data cannot be collected or processed at a centralized server.
A Decentralized Proximal-Gradient Method With Network Independent Step-Sizes and Separated Convergence Rates
This paper proposes a novel proximal-gradient algorithm for a decentralized optimization problem with a composite objective containing smooth and nonsmooth terms; its convergence is governed by two separated rates that match the typical rates of general gradient descent and of consensus averaging.
A Proximal Gradient Algorithm for Decentralized Composite Optimization
A proximal gradient exact first-order algorithm (PG-EXTRA) is proposed that utilizes the composite structure, has the best known convergence rate, and is a nontrivial extension of the recent algorithm EXTRA.
Multi-consensus Decentralized Accelerated Gradient Descent
A novel algorithm is proposed that achieves near-optimal communication complexity, matching the known lower bound up to a logarithmic factor of the condition number of the problem.
Decentralized Proximal Gradient Algorithms With Linear Convergence Rates
A general primal-dual algorithmic framework that unifies many existing state-of-the-art algorithms is proposed, and linear convergence of the proposed method to the exact minimizer is established in the presence of the nonsmooth term.
A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization
This work designs a proximal gradient decentralized algorithm whose fixed point coincides with the desired minimizer and provides a concise proof that establishes its linear convergence.
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent
This paper studies a D-PSGD algorithm and provides the first theoretical analysis that indicates a regime in which decentralized algorithms might outperform centralized algorithms for distributed stochastic gradient descent.
DSA: Decentralized Double Stochastic Averaging Gradient Algorithm
The decentralized double stochastic averaging gradient (DSA) algorithm is proposed as an alternative solution that relies on strong convexity of the local functions and Lipschitz continuity of the local gradients to guarantee linear convergence, in expectation, of the sequence generated by DSA.
Exact Diffusion for Distributed Optimization and Learning—Part I: Algorithm Development
The exact diffusion method is applicable to locally balanced left-stochastic combination matrices which, compared to the conventional doubly stochastic matrix, are more general and able to endow the algorithm with faster convergence rates, more flexible step-size choices, and improved privacy-preserving properties.
Prox-PDA: The Proximal Primal-Dual Algorithm for Fast Distributed Nonconvex Optimization and Learning Over Networks
A Proximal Primal-Dual Algorithm (Prox-PDA) is proposed, which enables the network nodes to distributedly and collectively compute the set of first-order stationary solutions at a global sublinear rate of O(1/r), where r is the iteration counter.
...