• Corpus ID: 73729264

Asynchronous Federated Optimization

Cong Xie, Oluwasanmi Koyejo, Indranil Gupta
Federated learning enables training on a massive number of edge devices. Empirical results show that the proposed algorithm converges fast and tolerates staleness.
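The staleness-tolerant server rule described in the paper can be sketched as a staleness-weighted moving average. The sketch below assumes the polynomial staleness function from the paper and toy dict-valued "models"; all names and the constants are illustrative, not the authors' implementation:

```python
def staleness_weight(alpha, staleness, a=0.5):
    """Polynomial staleness damping s(t - tau) = (t - tau + 1)^(-a):
    the more stale a client update is, the less it moves the server."""
    return alpha * (1.0 + staleness) ** (-a)

def server_update(global_model, client_model, alpha_t):
    """Mix the (possibly stale) client model into the global model."""
    return {k: (1.0 - alpha_t) * v + alpha_t * client_model[k]
            for k, v in global_model.items()}

# Toy round: a client trained from a 3-steps-old snapshot.
alpha_t = staleness_weight(0.6, staleness=3)   # 0.6 * 4 ** -0.5 = 0.3
x_global = server_update({"w": 1.0}, {"w": 0.0}, alpha_t)
```

A fresh update (staleness 0) would be mixed in with the full weight `alpha`; here the stale one is damped to 0.3, so the global weight only moves to 0.7.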

Citations

Joint Topology and Computation Resource Optimization for Federated Edge Learning

A novel penalty-based successive convex approximation method is proposed to solve the mixed-integer nonlinear problem, which converges to a stationary point of the primal problem under mild conditions.

Unbounded Gradients in Federated Learning with Buffered Asynchronous Aggregation

  • Taha Toghani, César A. Uribe
  • Computer Science
    2022 58th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
  • 2022
A theoretical analysis of the convergence rate of the FedBuff algorithm for asynchronous federated learning is presented when heterogeneity in data, batch size, and delay is considered.
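The buffered aggregation scheme analyzed here can be sketched in a few lines: the server holds client deltas in a buffer and only updates the global model once a fixed number have arrived. `FedBuffServer`, the buffer size, and the list-valued models are illustrative assumptions, not FedBuff's actual implementation:

```python
class FedBuffServer:
    """Sketch of buffered asynchronous aggregation: client deltas are
    buffered, and the global model is updated once K of them arrive."""
    def __init__(self, model, buffer_size, server_lr=1.0):
        self.model = list(model)
        self.K = buffer_size
        self.lr = server_lr
        self.buffer = []

    def receive(self, delta):
        """Called whenever any client finishes, at its own pace."""
        self.buffer.append(delta)
        if len(self.buffer) >= self.K:
            avg = [sum(d[j] for d in self.buffer) / len(self.buffer)
                   for j in range(len(self.model))]
            self.model = [w + self.lr * a for w, a in zip(self.model, avg)]
            self.buffer = []

server = FedBuffServer([0.0], buffer_size=2)
server.receive([1.0])   # buffered, model unchanged
server.receive([3.0])   # buffer full: model moves by the average delta
```

The buffer decouples the server step from any single straggler while still averaging over several clients per step, which is where the delay and heterogeneity terms in the convergence analysis enter.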

A Novel Framework for the Analysis and Design of Heterogeneous Federated Learning

This paper provides a general framework to analyze the convergence of federated optimization algorithms with heterogeneous local training progress at clients and proposes FedNova, a normalized averaging method that eliminates objective inconsistency while preserving fast error convergence.
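FedNova's normalized averaging can be sketched as follows: each client's accumulated local change is divided by its own number of local steps before averaging, then rescaled by an effective step count. This is a minimal sketch for plain local SGD with list-valued models; the function name and example numbers are illustrative:

```python
def fednova_aggregate(global_w, client_deltas, client_steps, weights):
    """Normalized averaging: client_deltas[i] is the total local change
    (w_local - w_global) after client_steps[i] SGD steps. Normalizing by
    the step count removes the bias toward fast clients that run more
    local steps per round."""
    tau_eff = sum(p * tau for p, tau in zip(weights, client_steps))
    d = [sum(p * delta[j] / tau
             for p, delta, tau in zip(weights, client_deltas, client_steps))
         for j in range(len(global_w))]
    return [w + tau_eff * dj for w, dj in zip(global_w, d)]

# Two clients with equal weight: one ran 2 local steps, one ran 1.
new_w = fednova_aggregate([0.0], [[-2.0], [-1.0]], [2, 1], [0.5, 0.5])
```

Naive averaging of the raw deltas would pull the model toward the client that ran more steps; normalization makes both clients contribute a per-step direction of equal magnitude.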

Accelerating Federated Edge Learning via Topology Optimization

A novel topology-optimized FEEL (TOFEL) scheme is proposed to tackle the heterogeneity issue in federated learning and to improve the communication-and-computation efficiency, and an efficient imitation-learning-based approach is seamlessly integrated into the TOFEL framework.

FedHe: Heterogeneous Models and Communication-Efficient Federated Learning

  • Chan Yun Hin, E. Ngai
  • Computer Science
    2021 17th International Conference on Mobility, Sensing and Networking (MSN)
  • 2021
This paper proposes a novel FL method, called FedHe, inspired by knowledge distillation, which can train heterogeneous models and support asynchronous training processes with significantly reduced communication overhead while preserving model accuracy.

HADFL: Heterogeneity-aware Decentralized Federated Learning Framework

Compared with a traditional FL system, HADFL can relieve the central server’s communication pressure, efficiently utilize heterogeneous computing power, and achieve speedups of up to 3.15x over decentralized-FedAvg and 4.68x over the PyTorch distributed training scheme, with almost no loss of convergence accuracy.

Time-Triggered Federated Learning Over Wireless Networks

This paper presents a time-triggered FL algorithm (TT-Fed) over wireless networks, which is a generalized form of classic synchronous and asynchronous FL, and provides a thorough convergence analysis for TT-Fed.

Asynchronous Semi-Decentralized Federated Edge Learning for Heterogeneous Clients

This work investigates a novel semi-decentralized FEEL architecture where multiple edge servers collaborate to incorporate more data from edge devices in training, and proposes an asynchronous training algorithm to overcome the device heterogeneity in computational resources.

Efficient Federated Learning Algorithm for Resource Allocation in Wireless IoT Networks

A convergence upper bound is provided characterizing the tradeoff between convergence rate and the number of global rounds, showing that a small number of active UEs per round still guarantees convergence, which advocates the proposed FL algorithm for a paradigm shift toward bandwidth-constrained wireless IoT networks.

Semi-Synchronous Federated Learning for Energy-Efficient Training and Accelerated Convergence in Cross-Silo Settings

A novel energy-efficient Semi-Synchronous Federated Learning protocol is introduced that mixes local models periodically with minimal idle time and fast convergence, and significantly outperforms previous work in data- and computationally heterogeneous environments.


References

Federated Optimization: Distributed Optimization Beyond the Datacenter

We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are distributed (unevenly) over an extremely large number of nodes.

Towards Federated Learning at Scale: System Design

A scalable production system for Federated Learning in the domain of mobile devices, based on TensorFlow, is built; the paper describes the resulting high-level design and sketches some of the challenges and their solutions.

Federated Learning: Strategies for Improving Communication Efficiency

Two ways to reduce the uplink communication costs are proposed: structured updates, where the user directly learns an update from a restricted space parametrized using a smaller number of variables, e.g. either low-rank or a random mask; and sketched updates, which learn a full model update and then compress it using a combination of quantization, random rotations, and subsampling.
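The sketched-update idea (subsampling a full model update, here without the quantization and rotation stages) can be illustrated in a few lines. The function names are hypothetical; the seed stands in for the shared randomness that lets the server know which coordinates were kept:

```python
import random

def sketch_update(update, keep_frac, seed):
    """Subsample a model update: keep each coordinate independently with
    probability keep_frac. The seed would be shared with the server so
    only the kept values need to be transmitted."""
    rng = random.Random(seed)
    return {i: v for i, v in enumerate(update) if rng.random() < keep_frac}

def unsketch(sketch, dim, keep_frac):
    """Server-side reconstruction: rescale kept entries by 1/keep_frac
    so the reconstruction is unbiased in expectation."""
    out = [0.0] * dim
    for i, v in sketch.items():
        out[i] = v / keep_frac
    return out

sketch = sketch_update([1.0, 2.0, 3.0], keep_frac=0.5, seed=42)
recon = unsketch(sketch, dim=3, keep_frac=0.5)
```

On average only `keep_frac` of the coordinates cross the uplink; the `1/keep_frac` rescaling trades uplink bandwidth for variance in the reconstructed update.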

Communication Efficient Distributed Machine Learning with the Parameter Server

An in-depth analysis of two large-scale machine learning problems, ranging from ℓ1-regularized logistic regression on CPUs to reconstruction ICA on GPUs, using 636 TB of real data with hundreds of billions of samples and dimensions, is presented.

Big data caching for networking: moving from cloud to edge

In order to cope with the relentless data tsunami in 5G wireless networks, current approaches such as acquiring new spectrum, deploying more BSs, and increasing nodes in mobile packet core networks

Communication-Efficient Learning of Deep Networks from Decentralized Data

This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
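The iterative model averaging at the heart of this method (FedAvg) reduces to a dataset-size-weighted mean of the client models each round; the sketch below uses list-valued models and an illustrative function name:

```python
def fedavg(client_models, client_sizes):
    """One round of iterative model averaging: average the locally
    trained models, weighting each client by its local dataset size."""
    total = sum(client_sizes)
    return [sum(n * m[j] for m, n in zip(client_models, client_sizes)) / total
            for j in range(len(client_models[0]))]

# A client holding 3x more data pulls the average 3x harder.
merged = fedavg([[0.0], [4.0]], client_sizes=[3, 1])
```

A full round would broadcast `merged` back to a fresh sample of clients, each of which runs several local SGD epochs before the next averaging step.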

Asynchronous Decentralized Parallel Stochastic Gradient Descent

This paper proposes an asynchronous decentralized stochastic gradient descent algorithm (AD-PSGD) that satisfies all of the above expectations and is the first asynchronous algorithm to achieve an epoch-wise convergence rate similar to AllReduce-SGD at an over 100-GPU scale.

Practical Secure Aggregation for Privacy-Preserving Machine Learning

This protocol allows a server to compute the sum of large, user-held data vectors from mobile devices in a secure manner, and can be used, for example, in a federated learning setting, to aggregate user-provided model updates for a deep neural network.
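The core cancellation trick behind this protocol can be sketched without the cryptography: each pair of clients derives a shared random mask, one adds it and the other subtracts it, so the masks vanish in the server's sum while each individual vector stays hidden. The seed-derivation function and all names are illustrative, not the actual key-agreement protocol:

```python
import random

def pair_mask(dim, pair_seed, lo, hi):
    """Toy stand-in for the mask a client pair (lo, hi) would derive
    from a shared secret; both sides compute the identical vector."""
    rng = random.Random(pair_seed * 1_000_003 + lo * 1_009 + hi)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def masked_vector(vec, client_id, all_ids, pair_seed=7):
    """Add the pairwise mask for each peer: +mask if this client has
    the smaller id in the pair, -mask otherwise."""
    out = list(vec)
    for peer in all_ids:
        if peer == client_id:
            continue
        lo, hi = min(client_id, peer), max(client_id, peer)
        mask = pair_mask(len(vec), pair_seed, lo, hi)
        sign = 1.0 if client_id == lo else -1.0
        out = [o + sign * m for o, m in zip(out, mask)]
    return out

def aggregate(masked_vectors):
    """Server sums the masked vectors; the pairwise masks cancel."""
    return [sum(v[j] for v in masked_vectors)
            for j in range(len(masked_vectors[0]))]
```

The real protocol adds secret sharing so the sum survives client dropouts; this sketch only shows why the server learns the sum and nothing else.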

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

The API design and the system implementation of MXNet are described, and it is explained how embedding of both symbolic expression and tensor operation is handled in a unified fashion.

Asynchronous Stochastic Gradient Descent with Delay Compensation

The proposed algorithm is evaluated on CIFAR-10 and ImageNet datasets, and the experimental results demonstrate that DC-ASGD outperforms both synchronous SGD and asynchronous SGD, and nearly approaches the performance of sequential SGD.
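The delay compensation in DC-ASGD approximates the Hessian correction for a stale gradient with a cheap elementwise term, g ⊙ g ⊙ (w_t − w_bak), where w_bak is the snapshot the gradient was computed at. A minimal sketch with list-valued parameters and illustrative names:

```python
def dc_asgd_update(w_current, w_backup, grad, lr, lam):
    """Apply a delayed gradient (computed at w_backup) to w_current,
    compensating the staleness with the diagonal outer-product
    approximation lam * g * g * (w_current - w_backup)."""
    compensated = [g + lam * g * g * (wc - wb)
                   for g, wc, wb in zip(grad, w_current, w_backup)]
    return [wc - lr * c for wc, c in zip(w_current, compensated)]

# No staleness (backup == current): reduces to a plain SGD step.
fresh = dc_asgd_update([1.0], [1.0], [2.0], lr=0.1, lam=0.5)
# Stale gradient: the compensation term enlarges the effective step.
stale = dc_asgd_update([1.0], [0.5], [2.0], lr=0.1, lam=0.5)
```

When the backup and current weights coincide, the correction term is zero, so the method gracefully degrades to ordinary asynchronous SGD for fresh gradients.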