• Corpus ID: 211132598

# Federated Learning with Matched Averaging

@article{Wang2020FederatedLW,
title={Federated Learning with Matched Averaging},
author={Hongyi Wang and Mikhail Yurochkin and Yuekai Sun and Dimitris Papailiopoulos and Yasaman Khazaeni},
journal={ArXiv},
year={2020},
volume={abs/2002.06440}
}
• Published 15 February 2020
• Computer Science
• ArXiv
Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud. We propose the Federated Matched Averaging (FedMA) algorithm, designed for federated learning of modern neural network architectures, e.g., convolutional neural networks (CNNs) and LSTMs. FedMA constructs the shared global model in a layer-wise manner by matching and averaging hidden elements…
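The layer-wise "matching and averaging" idea can be illustrated with a simplified sketch: align each client's neurons to a reference ordering before averaging, so that functionally similar hidden units are fused together. Here the alignment is a Hungarian assignment on Euclidean distances between weight vectors, a stand-in for the paper's Bayesian-nonparametric matching; the function name and cost are illustrative, not FedMA's exact procedure.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def matched_average(layers):
    """Average one hidden layer across clients after aligning neurons.

    layers: list of (num_neurons, fan_in) weight matrices, one per client.
    The first client's ordering is the reference; every other client's
    neurons are permuted to best match it (minimum-cost assignment on
    pairwise Euclidean distances) before the element-wise average.
    """
    ref = layers[0]
    aligned = [ref]
    for w in layers[1:]:
        # cost[i, j] = distance between reference neuron i and client neuron j
        cost = np.linalg.norm(ref[:, None, :] - w[None, :, :], axis=2)
        _, perm = linear_sum_assignment(cost)
        aligned.append(w[perm])
    return np.mean(aligned, axis=0)
```

Naive coordinate-wise averaging, by contrast, can destroy structure when clients learn the same features in different neuron orders, which is the failure mode the abstract's matching step addresses.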
## 256 Citations

Fedns: Improving Federated Learning for Collaborative Image Classification on Mobile Clients
• Computer Science
2021 IEEE International Conference on Multimedia and Expo (ICME)
• 2021
This paper proposes a new approach, termed Federated Node Selection (FedNS), for the server’s global model aggregation in the FL setting, which filters and re-weights the clients’ models at the node/kernel level, leading to a potentially better global model that fuses the best components of the clients’ models.
Quantization and Knowledge Distillation for Efficient Federated Learning on Edge Devices
• Computer Science
2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
• 2020
This work proposes an adaptive quantized federated average algorithm to reduce the communication cost by dynamically quantizing neural networks’ weights, and designs a federated knowledge distillation method to achieve high-quality small models with limited labeled data.
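As a rough illustration of how weight quantization cuts communication cost, here is a minimal uniform quantize/dequantize round trip at a fixed bit width; the cited work adapts the bit width dynamically during training, which this sketch does not model, and the function name is hypothetical.

```python
import numpy as np

def quantize_dequantize(w, num_bits=8):
    """Uniformly quantize weights to num_bits levels, then reconstruct.

    A client would transmit only the integer codes plus (lo, scale),
    shrinking the payload from 32-bit floats to num_bits per weight.
    """
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / (2 ** num_bits - 1) if hi > lo else 1.0
    codes = np.round((w - lo) / scale)  # integers in [0, 2**num_bits - 1]
    return codes * scale + lo
```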
Model-Contrastive Federated Learning
• Computer Science
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 2021
MOON is a simple and effective federated learning framework that utilizes the similarity between model representations to correct the local training of individual parties, i.e., conducting contrastive learning at the model level.
Federated matched averaging with information-gain based parameter sampling
• Computer Science
AIMLSystems
• 2021
A federated matched averaging algorithm with information-gain based sampling that considerably reduces the number of parameters to be sent by all clients in a federated learning paradigm.
FedProto: Federated Prototype Learning over Heterogeneous Devices
• Computer Science
ArXiv
• 2021
A novel federated prototype learning (FedProto) framework in which the devices and server communicate the class prototypes instead of the gradients, with FedProto outperforming several recent FL approaches on multiple datasets.
FedProc: Prototypical Contrastive Federated Learning on Non-IID data
• Computer Science
ArXiv
• 2021
FedProc is proposed, which is a simple and effective federated learning framework that designs a local network architecture and a global prototypical contrastive loss to regulate the training of local models, which makes local objectives consistent with the global optima.
Architecture Agnostic Federated Learning for Neural Networks
• Computer Science
ArXiv
• 2022
A novel Federated Heterogeneous Neural Networks (FedHeNN) framework that allows each client to build a personalised model without enforcing a common architecture across clients, and uses the instance-level representations obtained from peer clients to guide the simultaneous training on each client.
FedMix: Approximation of Mixup under Mean Augmented Federated Learning
• Computer Science
ICLR
• 2021
A new augmentation algorithm is proposed, named FedMix, which is inspired by a phenomenal yet simple data augmentation method, Mixup, but does not require local raw data to be directly shared among devices.
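For context, the Mixup operation the summary refers to forms a convex combination of two examples and their labels with a Beta-distributed mixing coefficient. FedMix's privacy-preserving approximation (mixing against averaged batches rather than raw peer data) is not reproduced here; this shows only the underlying operation, with illustrative names.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mixup: convex combination of two examples and their labels.

    lam ~ Beta(alpha, alpha), so mixed points interpolate between the
    two inputs; the same lam is applied to features and labels.
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```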
Algorithm 1: Federated Averaging Algorithm Input
• Computer Science
• 2021
A novel Divide and Conquer training methodology that enables the use of the popular FedAvg aggregation algorithm by overcoming the acknowledged FedAvg limitations in non-IID environments, achieving trained-model accuracy on par with (and in certain cases exceeding) the numbers achieved by state-of-the-art algorithms like FedProx, FedMA, etc.
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
• Computer Science
ArXiv
• 2021
This paper demonstrates that self-attention-based architectures (e.g., Transformers) are more robust to distribution shifts and hence improve federated learning over heterogeneous data; they can greatly reduce catastrophic forgetting of previous devices, accelerate convergence, and reach a better global model.

## References

Showing 1–10 of 31 references
LEAF: A Benchmark for Federated Settings
• Computer Science
ArXiv
• 2018
LEAF is proposed, a modular benchmarking framework for learning in federated settings that includes a suite of open-source federated datasets, a rigorous evaluation framework, and a set of reference implementations, all geared towards capturing the obstacles and intricacies of practical federated environments.
Bayesian Nonparametric Federated Learning of Neural Networks
• Computer Science
ICML
• 2019
A Bayesian nonparametric framework for federated learning with neural networks is developed that allows for a more expressive global network without additional supervision, data pooling and with as few as a single communication round.
Federated Multi-Task Learning
• Computer Science
NIPS
• 2017
This work shows that multi-task learning is naturally suited to handle the statistical challenges of the federated setting, and proposes a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues.
On the Convergence of Federated Optimization in Heterogeneous Networks
• Computer Science
ArXiv
• 2018
This work proposes and introduces FedProx, which is similar in spirit to FedAvg but more amenable to theoretical analysis, and describes the convergence of FedProx under a novel device-similarity assumption.
Agnostic Federated Learning
• Computer Science
ICML
• 2019
This work proposes a new framework of agnostic federated learning, where the centralized model is optimized for any target distribution formed by a mixture of the client distributions, and shows that this framework naturally yields a notion of fairness.
Federated Optimization in Heterogeneous Networks
• Computer Science
MLSys
• 2020
This work introduces a framework, FedProx, to tackle heterogeneity in federated networks, and provides convergence guarantees for this framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work.
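The proximal idea behind FedProx can be sketched in a few lines: each local objective adds a term (mu/2)·||w − w_global||², so every local gradient step is pulled back toward the current global model, limiting client drift under statistical heterogeneity. The function name, learning rate, and mu value below are illustrative, not values from the paper.

```python
def fedprox_local_step(w, w_global, grad, lr=0.1, mu=0.01):
    """One local gradient step with the FedProx proximal term.

    Local objective: F_k(w) + (mu / 2) * ||w - w_global||^2,
    whose gradient adds mu * (w - w_global) to the task gradient.
    mu = 0 recovers a plain FedAvg-style local SGD step.
    """
    return w - lr * (grad + mu * (w - w_global))
```

Works on scalars or NumPy arrays alike, since it uses only element-wise arithmetic.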
Federated Learning: Challenges, Methods, and Future Directions
• Computer Science
IEEE Signal Processing Magazine
• 2020
The unique characteristics and challenges of federated learning are discussed, a broad overview of current approaches are provided, and several directions of future work that are relevant to a wide range of research communities are outlined.
Communication-Efficient Learning of Deep Networks from Decentralized Data
• Computer Science
AISTATS
• 2017
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
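The iterative model averaging at the heart of this method (FedAvg) amounts to a weighted average of client parameters, with weights proportional to each client's local dataset size. A minimal sketch, with illustrative names, assuming each client model is a dict of parameter arrays:

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: size-weighted average of client parameters.

    client_weights: list of dicts mapping parameter name -> value
                    (floats or arrays supporting * and +).
    client_sizes:   number of local training examples per client;
                    averaging weights are proportional to these counts.
    """
    total = sum(client_sizes)
    return {
        name: sum((n / total) * w[name]
                  for w, n in zip(client_weights, client_sizes))
        for name in client_weights[0]
    }
```

In the full algorithm, each round samples a subset of clients, runs several local SGD epochs on each, and applies this aggregation to produce the next global model.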
Towards Federated Learning at Scale: System Design
• Computer Science
MLSys
• 2019
A scalable production system for federated learning on mobile devices, built on TensorFlow, is described, along with the resulting high-level design and some of the challenges and their solutions.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
• Computer Science
ICML
• 2015
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.