• Corpus ID: 211678094

@article{Reddi2021AdaptiveFO,
title={Adaptive Federated Optimization},
author={Sashank J. Reddi and Zachary B. Charles and Manzil Zaheer and Zachary Garrett and Keith Rush and Jakub Konecn{\'y} and Sanjiv Kumar and H. B. McMahan},
journal={ArXiv},
year={2021},
volume={abs/2003.00295}
}
• Published 29 February 2020
• Computer Science
• ArXiv
Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Due to the heterogeneity of the client datasets, standard federated optimization methods such as Federated Averaging (FedAvg) are often difficult to tune and exhibit unfavorable convergence behavior. In non-federated settings, adaptive optimization methods have had notable success in combating such issues. In…
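As a rough illustration of the Federated Averaging (FedAvg) baseline mentioned in the abstract, the following is a minimal sketch on a toy one-parameter regression problem. It shows plain FedAvg only, not the adaptive methods the paper goes on to propose. The function names (`local_update`, `fedavg_round`, `train`), the toy client datasets, and the hyperparameters are all assumptions made for illustration.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_update(w: float, data: Dataset, lr: float, steps: int) -> float:
    """Run a few full-batch gradient steps on one client's squared loss."""
    for _ in range(steps):
        grad = sum((w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fedavg_round(w_global: float, clients: List[Dataset], lr: float, steps: int) -> float:
    """One round: broadcast the model, train locally, average by dataset size."""
    total = sum(len(d) for d in clients)
    return sum(len(d) * local_update(w_global, d, lr, steps) for d in clients) / total


def train(clients: List[Dataset], rounds: int = 50, lr: float = 0.1,
          steps: int = 5, w0: float = 0.0) -> float:
    """Repeat FedAvg rounds starting from w0 and return the final server model."""
    w = w0
    for _ in range(rounds):
        w = fedavg_round(w, clients, lr, steps)
    return w


# Hypothetical heterogeneous clients: each client's data follows a different slope.
clients = [
    [(1.0, 1.0), (2.0, 2.0)],  # consistent with slope 1
    [(1.0, 3.0)],              # consistent with slope 3
]
w_final = train(clients)
```

Only model parameters cross the client boundary in this sketch; the raw `(x, y)` pairs never leave `local_update`. With a single local step per round, the procedure reduces to gradient descent on the pooled loss. With several local steps on heterogeneous clients, the result need not coincide with the pooled minimizer, which is one way to see the tuning difficulty the abstract refers to.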
## 282 Citations

FedCM: Federated Learning with Client-level Momentum
• Computer Science
ArXiv
• 2021
This paper proposes a new federated learning algorithm, Federated Averaging with Client-level Momentum (FedCM), to tackle problems of partial participation and client heterogeneity in real-world Federated learning applications.
Aggregation Delayed Federated Learning
• Computer Science
ArXiv
• 2021
This work proposes a new aggregation framework for federated learning by introducing redistribution rounds that delay the aggregation and shows that the proposed framework significantly improves the performance on non-IID data.
Federated Learning Based on Dynamic Regularization
• Computer Science
ICLR
• 2021
This work proposes a novel federated learning method for distributively training neural network models, where the server orchestrates cooperation between a subset of randomly chosen devices in each round, using a dynamic regularizer for each device at each round.
FedCluster: Boosting the Convergence of Federated Learning via Cluster-Cycling
• Computer Science
2020 IEEE International Conference on Big Data (Big Data)
• 2020
It is shown that FedCluster, with the devices implementing the local stochastic gradient descent (SGD) algorithm, achieves a faster convergence rate than the conventional federated averaging (FedAvg) algorithm in the presence of device-level data heterogeneity.
Behavior Mimics Distribution: Combining Individual and Group Behaviors for Federated Learning
• Computer Science
IJCAI
• 2021
A novel Federated Learning algorithm (called IGFL), which leverages both Individual and Group behaviors to mimic distribution, thereby improving the ability to deal with heterogeneity, and can significantly improve the performance of existing federated learning methods.
Double Momentum SGD for Federated Learning
• Computer Science
ArXiv
• 2021
This work proposes a new SGD variant named DOMO to improve model performance in federated learning, in which double momentum buffers are maintained, and introduces a novel server momentum fusion technique to coordinate the server and local momentum SGD.
Fine-tuning is Fine in Federated Learning
• Computer Science
ArXiv
• 2021
The theory suggests that Finetuned Federated Averaging (FTFA) is competitive with more sophisticated meta-learning and proximal-regularized approaches and is computationally more efficient than its competitors.
Debiasing Model Updates for Improving Personalized Federated Training
• Computer Science
ICML
• 2021
This work proposes gradient correction methods that leverage prior work, explicitly de-biases the meta-model in the distributed heterogeneous data setting to learn personalized device models, and presents convergence guarantees of the method for strongly convex, convex, and nonconvex meta objectives.
Straggler-Resilient Federated Learning: Leveraging the Interplay Between Statistical Accuracy and System Heterogeneity
• Computer Science
ArXiv
• 2020
A novel straggler-resilient federated learning method that incorporates statistical characteristics of the clients’ data to adaptively select the clients in order to speed up the learning procedure.
Semi-Synchronous Federated Learning
• Computer Science
ArXiv
• 2021
This work introduces a novel Semi-Synchronous Federated Learning protocol that mixes local models periodically with minimal idle time and fast convergence, and that significantly outperforms previous work in data and computationally heterogeneous environments.

## References

SHOWING 1-10 OF 52 REFERENCES
SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning
• Computer Science
ArXiv
• 2019
This work proposes a new Stochastic Controlled Averaging algorithm (SCAFFOLD), which uses control variates to reduce the drift between different clients, and proves that the algorithm requires significantly fewer rounds of communication and benefits from favorable convergence guarantees.
Federated Optimization in Heterogeneous Networks
• Computer Science
MLSys
• 2020
This work introduces a framework, FedProx, to tackle heterogeneity in federated networks, and provides convergence guarantees for this framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work.
LEAF: A Benchmark for Federated Settings
• Computer Science
ArXiv
• 2018
LEAF is proposed, a modular benchmarking framework for learning in federated settings that includes a suite of open-source federated datasets, a rigorous evaluation framework, and a set of reference implementations, all geared towards capturing the obstacles and intricacies of practical federated environments.
Adaptive Federated Learning in Resource Constrained Edge Computing Systems
• Computer Science
IEEE Journal on Selected Areas in Communications
• 2019
This paper analyzes the convergence bound of distributed gradient descent from a theoretical point of view, and proposes a control algorithm that determines the best tradeoff between local update and global parameter aggregation to minimize the loss function under a given resource budget.
Federated Learning: Challenges, Methods, and Future Directions
• Computer Science
IEEE Signal Processing Magazine
• 2020
The unique characteristics and challenges of federated learning are discussed, a broad overview of current approaches is provided, and several directions of future work that are relevant to a wide range of research communities are outlined.
Towards Federated Learning at Scale: System Design
• Computer Science
MLSys
• 2019
This work builds a scalable production system for Federated Learning in the domain of mobile devices, based on TensorFlow, describes the resulting high-level design, and sketches some of the challenges and their solutions.
On the Convergence of FedAvg on Non-IID Data
• Computer Science
ICLR
• 2020
This paper analyzes the convergence of Federated Averaging on non-IID data and establishes a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGD iterations.
Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification
• Computer Science
ArXiv
• 2019
This work proposes a way to synthesize datasets with a continuous range of identicalness, provides performance measures for the Federated Averaging algorithm, shows that performance degrades as distributions differ more, and proposes a mitigation strategy via server momentum.
On the Outsized Importance of Learning Rates in Local Update Methods
• Computer Science
ArXiv
• 2020
This work proves that, for quadratic objectives, local update methods perform stochastic gradient descent on a surrogate loss function, which it exactly characterizes, and uses this theory to derive novel convergence rates for federated averaging that showcase the trade-off between the condition number of the surrogate loss and its alignment with the true loss function.
Communication-Efficient Learning of Deep Networks from Decentralized Data
• Computer Science
AISTATS
• 2017
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.