Contextual Model Aggregation for Fast and Robust Federated Learning in Edge Computing

Hung T. Nguyen, H. Vincent Poor, Mung Chiang
Federated learning is a prime candidate for distributed machine learning at the network edge due to its low communication complexity and privacy protection, among other attractive properties. However, existing algorithms suffer from slow convergence and/or a lack of robust performance because of the considerable heterogeneity in data distribution, computation, and communication capability at the edge. In this work, we tackle both of these issues by focusing on the key component of model aggregation…

Adaptive Federated Learning in Resource Constrained Edge Computing Systems

This paper analyzes the convergence bound of distributed gradient descent from a theoretical point of view, and proposes a control algorithm that determines the best tradeoff between local update and global parameter aggregation to minimize the loss function under a given resource budget.

Two Timescale Hybrid Federated Learning with Cooperative D2D Local Model Aggregations

An adaptive control algorithm is developed that tunes the step size, D2D communication rounds, and global aggregation period of TT-HF over time to target a sublinear convergence rate of O(1/t) while minimizing network resource utilization.

Fast-Convergent Federated Learning

This paper proposes a fast-convergent federated learning algorithm that performs intelligent sampling of devices in each round of model training to optimize the expected convergence speed, and shows experimentally that it improves trained model accuracy, convergence speed, and/or model stability across various machine learning tasks and datasets.

Device Sampling for Heterogeneous Federated Learning: Theory, Algorithms, and Implementation

A sampling methodology based on graph convolutional networks (GCNs) which learns the relationship between network attributes, sampled nodes, and resulting offloading that maximizes FedL accuracy is developed.

Federated Optimization in Heterogeneous Networks

This work introduces a framework, FedProx, to tackle heterogeneity in federated networks, and provides convergence guarantees for this framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work.
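The idea behind FedProx's handling of heterogeneity can be illustrated with a minimal sketch. It assumes the local objective is the device loss plus a proximal term (mu / 2) * ||w - w_global||^2 and that the device takes plain gradient steps; the function name, argument names, and step rule are illustrative and are not taken from the paper's code.

```python
def fedprox_local_step(w, w_global, grad, mu, lr):
    """One gradient step on f_i(w) + (mu / 2) * ||w - w_global||^2.

    w        : current local parameters (list of floats)
    w_global : global model received at the start of the round
    grad     : gradient of the local loss f_i evaluated at w
    mu       : proximal coefficient (assumed hyperparameter)
    lr       : local learning rate (assumed hyperparameter)
    """
    # The proximal term contributes mu * (w - w_global) to the gradient,
    # pulling the local iterate back toward the global model.
    return [
        wi - lr * (gi + mu * (wi - wgi))
        for wi, wgi, gi in zip(w, w_global, grad)
    ]


# With a zero loss gradient, only the proximal pull moves the parameters.
updated = fedprox_local_step([1.0], [0.0], [0.0], mu=0.5, lr=0.1)
```

A device with less compute could simply run fewer such steps, which is one way to read the "variable amount of work" in the summary above.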

Personalized Federated Learning with Moreau Envelopes

This work proposes an algorithm for personalized FL (pFedMe) using Moreau envelopes as clients' regularized loss functions, which help decouple personalized model optimization from the global model learning in a bi-level problem stylized for personalized FL.
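For orientation, the bi-level structure can be written out as follows. The notation here ($f_i$ for client $i$'s loss, $\theta_i$ for its personalized model, $w$ for the global model, $\lambda$ for the regularization weight, $N$ for the number of clients) is chosen for illustration and may differ from the paper's.

```latex
% Inner problem: each client's Moreau envelope of its own loss
F_i(w) = \min_{\theta_i} \left\{ f_i(\theta_i) + \frac{\lambda}{2} \lVert \theta_i - w \rVert^2 \right\}

% Outer problem: the global model minimizes the average envelope
\min_{w} \; \frac{1}{N} \sum_{i=1}^{N} F_i(w)
```

The inner minimizer $\theta_i$ acts as the personalized model, while $w$ is learned across all clients.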

On the Convergence of FedAvg on Non-IID Data

This paper analyzes the convergence of Federated Averaging on non-iid data and establishes a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGD iterations.

Network-Aware Optimization of Distributed Learning for Fog Computing

This work analytically characterizes the optimal data transfer solution for different fog network topologies, showing for example that the value of a device offloading is approximately linear in the range of computing costs in the network.

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

This work obtains tight convergence rates for FedAvg and proves that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence, and proposes a new algorithm (SCAFFOLD) which uses control variates (variance reduction) to correct for the 'client-drift' in its local updates.
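The control-variate correction can be sketched as a single local step, assuming the client keeps its own control variate and receives a server-side one each round. The function and argument names are illustrative, and how the control variates themselves are refreshed is left out.

```python
def scaffold_local_step(y, grad, c_local, c_global, lr):
    """One drift-corrected local step: y <- y - lr * (grad - c_local + c_global).

    y        : current local parameters (list of floats)
    grad     : stochastic gradient of the local loss at y
    c_local  : this client's control variate (estimate of its own gradient direction)
    c_global : server control variate (estimate of the average direction)
    lr       : local learning rate (assumed hyperparameter)
    """
    # Subtracting c_local and adding c_global replaces the client-specific
    # part of the gradient with an estimate of the global direction.
    return [
        yi - lr * (gi - ci + cg)
        for yi, gi, ci, cg in zip(y, grad, c_local, c_global)
    ]


stepped = scaffold_local_step([1.0], [2.0], [1.5], [0.5], lr=0.1)
```

When the two control variates are equal, the step reduces to ordinary local SGD.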

Communication-Efficient Learning of Deep Networks from Decentralized Data

This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
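As a rough illustration of the model-averaging step, the sketch below averages client parameter vectors weighted by local dataset size. It is a simplified reading of iterative model averaging rather than the paper's implementation; the function name and the flat-list model representation are assumptions.

```python
def fedavg_aggregate(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights : list of parameter vectors (each a list of floats)
    client_sizes   : number of local training examples per client
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


# Two clients; the second holds three times as much data as the first.
global_model = fedavg_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full training loop, the server would broadcast the averaged model back to a fresh set of clients and repeat the procedure each round.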