Corpus ID: 245634212

Efficient and Reliable Overlay Networks for Decentralized Federated Learning

@article{Hua2021EfficientAR,
  title={Efficient and Reliable Overlay Networks for Decentralized Federated Learning},
  author={Yifan Hua and Kevin Miller and A. Bertozzi and Chen Qian and Bao Wang},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.15486}
}
We propose near-optimal overlay networks based on d-regular expander graphs to accelerate decentralized federated learning (DFL) and improve its generalization. In DFL, a massive number of clients are connected by an overlay network and collaboratively solve machine learning problems without sharing raw data. Our overlay network design integrates spectral graph theory with the theoretical convergence and generalization bounds for DFL. As such, our proposed overlay networks accelerate…
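The core construction is a d-regular expander overlay, whose spectral gap controls how quickly decentralized averaging mixes information across clients. Below is a minimal sketch (assuming networkx and numpy; the client count and degree are illustrative, not values from the paper) that samples a random d-regular graph, which is an expander with high probability, and measures the spectral gap of the induced mixing matrix:

```python
# Sketch: sample a random d-regular overlay (an expander w.h.p.) and check
# the spectral gap of its mixing matrix. Parameters are illustrative.
import networkx as nx
import numpy as np

def spectral_gap(G: nx.Graph, d: int) -> float:
    """Gap 1 - |lambda_2| of the doubly stochastic mixing matrix
    W = (A + I) / (d + 1) associated with a d-regular graph G."""
    A = nx.to_numpy_array(G)
    W = (A + np.eye(len(G))) / (d + 1)
    eig = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
    return 1.0 - eig[1]  # eig[0] == 1 for a connected graph

n, d = 100, 6  # number of clients and overlay degree (illustrative)
G = nx.random_regular_graph(d, n, seed=0)
print(f"connected: {nx.is_connected(G)}, spectral gap: {spectral_gap(G, d):.3f}")
```

A larger spectral gap implies faster consensus in decentralized averaging, which is the graph quantity that the convergence and generalization bounds mentioned in the abstract depend on.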

References

SHOWING 1-10 OF 56 REFERENCES
Federated Optimization in Heterogeneous Networks
TLDR
This work introduces FedProx, a framework to tackle heterogeneity in federated networks, and provides convergence guarantees for this framework when learning over data from non-identical distributions (statistical heterogeneity) while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work.
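FedProx's key mechanism is a proximal term added to each device's local objective, $\min_w F_k(w) + \tfrac{\mu}{2}\|w - w^t\|^2$, which tethers local iterates to the current global model. A hedged PyTorch sketch (`model`, `opt`, `loss_fn`, and `global_params` are hypothetical handles, not the authors' code):

```python
# Hedged sketch of FedProx's proximal local objective
#   min_w  F_k(w) + (mu/2) * ||w - w_global||^2
# `model`, `opt`, `loss_fn`, and `global_params` are hypothetical handles,
# not the authors' code; `mu` is the proximal coefficient.
import torch

def fedprox_local_step(model, global_params, batch, loss_fn, opt, mu=0.01):
    x, y = batch
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    # proximal term tethers local weights to the current global model
    prox = sum((w - w0).pow(2).sum()
               for w, w0 in zip(model.parameters(), global_params))
    (loss + 0.5 * mu * prox).backward()
    opt.step()
```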
Throughput-Optimal Topology Design for Cross-Silo Federated Learning
TLDR
This paper formulates the problem of topology design for cross-silo federated learning, using the theory of max-plus linear systems to compute the system throughput (the number of communication rounds per time unit), and proposes practical algorithms that find a topology with the largest throughput or with provable throughput guarantees.
Decentralized Federated Averaging
TLDR
This paper studies decentralized FedAvg with momentum (DFedAvgM), implemented on clients connected by an undirected graph, and proves convergence of the (quantized) DFedAvgM under trivial assumptions; the convergence rate can be improved when the loss function satisfies the Polyak-Łojasiewicz (PŁ) property.
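A rough sketch of one DFedAvgM round under simplifying assumptions: `w` and `m` are lists of numpy parameter vectors, `grad(i, w_i)` is a stochastic gradient oracle, and `W` is a symmetric doubly stochastic mixing matrix over the overlay graph; all of these names are hypothetical, and the exact interleaving of local steps and gossip follows the paper rather than this sketch.

```python
# Rough sketch of one DFedAvgM round: local SGD with momentum on every
# client, then gossip averaging over the overlay. `w`/`m` are lists of
# numpy parameter vectors, `grad(i, w_i)` a stochastic gradient oracle,
# and `W` a symmetric doubly stochastic mixing matrix; all hypothetical.
import numpy as np

def dfedavgm_round(w, m, grad, W, lr=0.1, beta=0.9, local_steps=5):
    n = len(w)
    for i in range(n):                       # local SGD with momentum
        for _ in range(local_steps):
            m[i] = beta * m[i] + grad(i, w[i])
            w[i] = w[i] - lr * m[i]
    # mix with overlay neighbors (nonzero entries of row i of W)
    w = [sum(W[i, j] * w[j] for j in range(n)) for i in range(n)]
    return w, m
```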
FedSplit: An algorithmic framework for fast federated optimization
TLDR
FedSplit is introduced, a class of algorithms based on operator splitting procedures for solving distributed convex minimization with additive structure and theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
Adaptive Federated Optimization
TLDR
This work proposes federated versions of adaptive optimizers, including Adagrad, Adam, and Yogi, and analyzes their convergence in the presence of heterogeneous data for general nonconvex settings to highlight the interplay between client heterogeneity and communication efficiency.
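The server-side view of these adaptive methods is simple: the average of client model deltas is treated as a pseudo-gradient and fed into an Adam-style update. A sketch with hypothetical numpy state and illustrative hyperparameters:

```python
# Sketch of a FedAdam-style server step: the mean client delta acts as a
# pseudo-gradient for an Adam update of the global model. All variables
# are hypothetical numpy arrays; hyperparameters are illustrative.
import numpy as np

def fedadam_server_step(w, deltas, m, v, lr=0.1, b1=0.9, b2=0.99, tau=1e-3):
    g = -np.mean(deltas, axis=0)             # pseudo-gradient
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    w = w - lr * m / (np.sqrt(v) + tau)      # tau controls adaptivity
    return w, m, v
```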
FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data
TLDR
This paper first explicitly characterizes the behavior of the FedAvg algorithm and shows that, without strong and unrealistic assumptions on the problem structure, the algorithm can behave erratically for non-convex problems (e.g., diverge to infinity).
MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling
TLDR
A novel algorithm, MATCHA, is proposed that uses matching decomposition sampling of the base topology to parallelize inter-worker information exchange, significantly reducing communication delay, and communicates more frequently over critical links so that it maintains the same convergence rate as vanilla decentralized SGD.
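MATCHA's decomposition step can be approximated greedily: split the base topology into matchings (sets of disjoint edges that can communicate in parallel), then activate each matching at random every round. A sketch assuming networkx; MATCHA itself optimizes per-matching activation probabilities under a communication budget, whereas this uses a uniform probability `p`:

```python
# Greedy approximation of MATCHA's matching decomposition: split the base
# topology into matchings (disjoint edges that can communicate in parallel)
# and activate each one at random per round. Assumes networkx; MATCHA
# optimizes per-matching activation probabilities, here uniform `p`.
import random
import networkx as nx

def matching_decomposition(G):
    H, matchings = G.copy(), []
    while H.number_of_edges() > 0:
        M = nx.maximal_matching(H)
        matchings.append(M)
        H.remove_edges_from(M)
    return matchings

def sample_round_topology(matchings, p=0.5, rng=None):
    rng = rng or random.Random(0)
    # edges active for gossip in this communication round
    return [e for M in matchings if rng.random() < p for e in M]
```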
On the Convergence of FedAvg on Non-IID Data
TLDR
This paper analyzes the convergence of Federated Averaging on non-iid data and establishes a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGD iterations.
Distributed construction of random expander networks
Ching Law, K. Siu. IEEE INFOCOM 2003.
TLDR
A novel distributed algorithm is presented for constructing random overlay networks composed of d Hamilton cycles; the construction is robust against an offline adversary that selects the sequence of join and leave operations.
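This construction is directly relevant to the proposed overlays: superimposing d random Hamiltonian cycles yields a 2d-regular graph that is an expander with high probability. A centralized stand-in for the paper's distributed join/leave protocol (assuming networkx; not the authors' algorithm):

```python
# Centralized stand-in for the distributed construction: superimpose d
# random Hamiltonian cycles on n nodes, giving a 2d-regular multigraph
# that is an expander with high probability. Assumes networkx.
import random
import networkx as nx

def hamilton_cycle_overlay(n, d, seed=0):
    rng = random.Random(seed)
    G = nx.MultiGraph()
    G.add_nodes_from(range(n))
    for _ in range(d):
        order = list(range(n))
        rng.shuffle(order)                   # one random Hamiltonian cycle
        G.add_edges_from(zip(order, order[1:] + order[:1]))
    return G                                 # every node has degree 2d
```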
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
TLDR
This work obtains tight convergence rates for FedAvg and proves that it suffers from 'client drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence, and proposes a new algorithm, SCAFFOLD, which uses control variates (variance reduction) to correct for client drift in its local updates.
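SCAFFOLD's correction is a control-variate term subtracted from each local gradient step. A sketch with hypothetical numpy variables, following the commonly cited "option II" control-variate refresh:

```python
# Sketch of SCAFFOLD's drift-corrected local update: client i subtracts
# its control variate c_i and adds the server's c, then refreshes c_i
# (the commonly cited "option II"). Hypothetical numpy variables.
import numpy as np

def scaffold_local_steps(w, grad, c_i, c, lr=0.1, K=5):
    w0 = w.copy()
    for _ in range(K):
        w = w - lr * (grad(w) - c_i + c)     # variance-reduced step
    c_i_new = c_i - c + (w0 - w) / (K * lr)  # new client control variate
    return w, c_i_new
```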