Corpus ID: 245634212

Efficient and Reliable Overlay Networks for Decentralized Federated Learning

@article{Hua2021EfficientAR,
  title={Efficient and Reliable Overlay Networks for Decentralized Federated Learning},
  author={Yifan Hua and Kevin Miller and A. Bertozzi and Chen Qian and Bao Wang},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.15486}
}
We propose near-optimal overlay networks based on d-regular expander graphs to accelerate decentralized federated learning (DFL) and improve its generalization. In DFL a massive number of clients are connected by an overlay network, and they solve machine learning problems collaboratively without sharing raw data. Our overlay network design integrates spectral graph theory and the theoretical convergence and generalization bounds for DFL. As such, our proposed overlay networks accelerate… 
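The role of the spectral gap can be illustrated with a small sketch (not the authors' construction): in gossip averaging over a d-regular topology, disagreement between clients contracts at a rate governed by the second-largest eigenvalue magnitude of the doubly stochastic mixing matrix, so expander-like topologies with a larger spectral gap mix faster than a plain ring. The circulant offsets below are illustrative, not taken from the paper.

```python
import numpy as np

def ring_mixing(n):
    # 2-regular ring: each node averages uniformly with itself and 2 neighbours
    W = np.zeros((n, n))
    for i in range(n):
        for off in (-1, 0, 1):
            W[i, (i + off) % n] = 1 / 3
    return W

def circulant_mixing(n, offsets):
    # d-regular circulant graph: node i links to i +/- each offset (mod n)
    d = 2 * len(offsets)
    W = np.eye(n) / (d + 1)
    for i in range(n):
        for off in offsets:
            W[i, (i + off) % n] += 1 / (d + 1)
            W[i, (i - off) % n] += 1 / (d + 1)
    return W

def spectral_gap(W):
    # 1 minus the second-largest eigenvalue magnitude; larger gap = faster mixing
    lam = np.sort(np.abs(np.linalg.eigvals(W)))[::-1]
    return 1.0 - lam[1]

n = 16
print(spectral_gap(ring_mixing(n)))               # small gap: slow consensus
print(spectral_gap(circulant_mixing(n, [1, 5])))  # larger gap: faster consensus
```

With the same degree budget, spreading the links out (as an expander-style design does) enlarges the spectral gap, which is the quantity the paper's convergence bounds depend on.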

Citations

Joint Consensus Matrix Design and Resource Allocation for Decentralized Learning
TLDR
A novel algorithm, Communication-Efficient Network Topology (CENT), is proposed; it reduces the latency of each training iteration by removing unnecessary communication links, and preserves the training convergence rate while enforcing communication-graph sparsity and avoiding poor communication links.

References

SHOWING 1-10 OF 55 REFERENCES
Federated Optimization in Heterogeneous Networks
TLDR
This work introduces FedProx, a framework to tackle heterogeneity in federated networks, and provides convergence guarantees when learning over data from non-identical distributions (statistical heterogeneity) while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work.
Throughput-Optimal Topology Design for Cross-Silo Federated Learning
TLDR
This paper formulates the problem of topology design for cross-silo federated learning, using the theory of max-plus linear systems to compute the system throughput (the number of communication rounds per time unit), and proposes practical algorithms that find a topology with the largest throughput or with provable throughput guarantees.
FedSplit: An algorithmic framework for fast federated optimization
TLDR
FedSplit is introduced, a class of algorithms based on operator splitting procedures for solving distributed convex minimization with additive structure and theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
Decentralized Federated Averaging
TLDR
The analysis in this paper is much more challenging than for previous decentralized (momentum) SGD or FedAvg; convergence of the (quantized) DFedAvgM is proved under mild assumptions, and the convergence rate can be improved to sublinear when the loss function satisfies the Polyak-Łojasiewicz (PŁ) property.
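The decentralized-averaging idea behind algorithms like DFedAvg can be sketched with a toy simulation (not the paper's algorithm): each client takes a local gradient step on its own quadratic loss, then averages its model copy with its neighbours via a doubly stochastic mixing matrix, and all copies drift toward the global minimiser.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim, steps, lr = 8, 4, 300, 0.1

# Client i holds a private quadratic loss f_i(x) = 0.5 * ||x - c_i||^2,
# so the global minimiser is the mean of the targets c_i.
targets = rng.normal(size=(n, dim))
x = rng.normal(size=(n, dim))            # one model copy per client

# Ring topology: each client averages uniformly with itself and 2 neighbours.
W = np.zeros((n, n))
for i in range(n):
    for off in (-1, 0, 1):
        W[i, (i + off) % n] = 1 / 3

for _ in range(steps):
    grads = x - targets                  # gradient of each local quadratic
    x = W @ (x - lr * grads)             # local step, then neighbour averaging

global_opt = targets.mean(axis=0)
print(np.linalg.norm(x - global_opt))    # all copies end up near the optimum
```

With a constant step size the copies settle in a neighbourhood of the global optimum whose size shrinks with the step size and grows as the topology's spectral gap shrinks, which is why topology design matters for DFL.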
FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data
TLDR
This paper first explicitly characterizes the behavior of the FedAvg algorithm, and shows that without strong and unrealistic assumptions on the problem structure, the algorithm can behave erratically for non-convex problems (e.g., diverge to infinity).
MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling
TLDR
A novel algorithm MATCHA is proposed that uses matching decomposition sampling of the base topology to parallelize inter-worker information exchange so as to significantly reduce communication delay and communicates more frequently over critical links such that it can maintain the same convergence rate as vanilla decentralized SGD.
On the Convergence of FedAvg on Non-IID Data
TLDR
This paper analyzes the convergence of Federated Averaging on non-iid data and establishes a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGD iterations.
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
TLDR
This work obtains tight convergence rates for FedAvg and proves that it suffers from "client drift" when the data is heterogeneous (non-iid), resulting in unstable and slow convergence; it then proposes a new algorithm, SCAFFOLD, which uses control variates (variance reduction) to correct for the client drift in its local updates.
Jellyfish: Networking Data Centers Randomly
TLDR
Jellyfish is a high-capacity network interconnect which, by adopting a random graph topology, lends itself naturally to incremental expansion and is more cost-efficient than a fat-tree.
A survey and comparison of peer-to-peer overlay network schemes
TLDR
A survey and comparison of various structured and unstructured P2P overlay networks is presented; the schemes are categorized into these two groups in the design spectrum, and the application-level network performance of each group is discussed.
...