Corpus ID: 14955348

Communication-Efficient Learning of Deep Networks from Decentralized Data

@inproceedings{McMahan2017CommunicationEfficientLO,
  title={Communication-Efficient Learning of Deep Networks from Decentralized Data},
  author={H. Brendan McMahan and Eider Moore and Daniel Ramage and Seth Hampson and Blaise Ag{\"u}era y Arcas},
  booktitle={AISTATS},
  year={2017}
}
Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. [...] We term this decentralized approach Federated Learning. We present a practical method for the federated learning of deep networks based on iterative model averaging, and conduct an extensive empirical evaluation, considering five different model architectures and four datasets. These experiments demonstrate the approach is robust to the…
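As a rough illustration of the iterative model averaging described in the abstract, the following minimal sketch (not the authors' code) has a few sampled clients train locally from the current global weights and then averages their models weighted by client data size. The least-squares local objective, client counts, and hyperparameters are illustrative assumptions.

# Minimal federated-averaging sketch (hypothetical, NumPy-only).
# Each round: a subset of clients trains locally from the current global
# weights, then the server averages the results weighted by local data size.
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """A few epochs of least-squares gradient descent on one client's data."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Toy decentralized data: 10 clients with unbalanced linear-regression data.
d = 5
true_w = rng.normal(size=d)
clients = []
for _ in range(10):
    n = rng.integers(20, 200)            # unbalanced client sizes
    X = rng.normal(size=(n, d))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=n)))

w_global = np.zeros(d)
for rnd in range(20):                     # communication rounds
    chosen = rng.choice(len(clients), size=3, replace=False)
    updates, sizes = [], []
    for k in chosen:
        X, y = clients[k]
        updates.append(local_sgd(w_global, X, y))
        sizes.append(len(y))
    # Weighted average of client models (the "iterative model averaging" step).
    w_global = np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

print("error:", np.linalg.norm(w_global - true_w))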
Decentralized Deep Learning with Arbitrary Communication Compression
TLDR: The use of communication compression in the decentralized training context achieves linear speedup in the number of workers and supports higher compression than previous state-of-the-art methods.
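To make the idea of compressed decentralized training concrete, here is a toy sketch (not this paper's algorithm, which may use error feedback and a different compressor) of neighbors on a ring exchanging top-k sparsified model differences and averaging over them.

# Illustrative only: a top-k sparsifier and one gossip-averaging step where
# neighbors exchange compressed model differences. Topology, step size, and
# the compressor are assumptions for the sketch.
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero elsewhere."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

rng = np.random.default_rng(1)
n_workers, d, k = 4, 100, 10
params = [rng.normal(size=d) for _ in range(n_workers)]   # one model per worker
ring = [((i - 1) % n_workers, (i + 1) % n_workers) for i in range(n_workers)]

# One communication round on a ring: each worker moves toward the average of
# the compressed differences received from its two neighbors.
step = 0.5
new_params = []
for i, (left, right) in enumerate(ring):
    msgs = [top_k(params[j] - params[i], k) for j in (left, right)]
    new_params.append(params[i] + step * np.mean(msgs, axis=0))
params = new_params

print("spread of model norms:", np.std([np.linalg.norm(p) for p in params]))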
Simple, Efficient and Convenient Decentralized Multi-task Learning for Neural Networks
TLDR: This paper proposes a novel learning method for neural networks that is decentralized and multi-task and keeps users' data local, and evaluates its efficiency in different situations on various kinds of neural networks and learning algorithms, demonstrating its benefits in terms of learning quality and convergence.
Distributed Distillation for On-Device Learning
TLDR: A distributed distillation algorithm is introduced in which devices communicate and learn from soft-decision (softmax) outputs, which are inherently architecture-agnostic and scale only with the number of classes.
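A hedged sketch of that exchange: devices share only class-probability vectors on a common reference batch, and each device adds a distillation term pulling its own softmax toward the average of the group. The linear "models", shared batch, and loss details below are assumptions, not the paper's setup.

# Hypothetical sketch of learning from exchanged soft-decision outputs.
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
n_devices, n_ref, d, n_classes = 3, 32, 8, 4
X_ref = rng.normal(size=(n_ref, d))                     # shared unlabeled batch
W = [rng.normal(size=(d, n_classes)) for _ in range(n_devices)]  # per-device linear models

lr = 0.5
for _ in range(50):
    probs = [softmax(X_ref @ Wi) for Wi in W]           # what devices exchange
    consensus = np.mean(probs, axis=0)                   # scales only with #classes
    for i in range(n_devices):
        # Gradient of cross-entropy(consensus, softmax(X @ W_i)) w.r.t. W_i.
        grad = X_ref.T @ (probs[i] - consensus) / n_ref
        W[i] -= lr * grad

print("max disagreement:",
      max(np.abs(softmax(X_ref @ W[0]) - softmax(X_ref @ W[i])).max()
          for i in range(1, n_devices)))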
Distributed generation of privacy preserving data with user customization
TLDR: This work introduces a decoupling of the creation of a latent representation from the privatization of data, which allows user-specific privatization to occur in a distributed setting with limited computation and minimal disturbance to the utility of the data.
Collaborative Unsupervised Visual Representation Learning from Decentralized Data
TLDR: This work proposes a novel federated unsupervised learning framework, FedU, which outperforms training with only one party by over 5% and other methods by over 14% in linear and semi-supervised evaluation on non-IID data.
Federated Learning: Strategies for Improving Communication Efficiency
TLDR: Two ways to reduce uplink communication costs are proposed: structured updates, where the user directly learns an update from a restricted space parametrized using a smaller number of variables (e.g., low-rank or a random mask); and sketched updates, where the user learns a full model update and then compresses it using a combination of quantization, random rotations, and subsampling.
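A simplified stand-in for the sketched-update idea (not the paper's exact scheme, which also uses random rotations): subsample coordinates of an update, quantize each survivor to its sign times the mean magnitude, and rescale to compensate for the subsampling. All constants here are illustrative.

# Hedged sketch of "sketched updates": subsample + 1-bit quantize + rescale.
import numpy as np

rng = np.random.default_rng(3)

def sketch_update(u, keep_frac=0.1):
    """Encode a subsampled, sign-quantized version of the update u."""
    d = u.size
    k = max(1, int(keep_frac * d))
    idx = rng.choice(d, size=k, replace=False)
    vals = u[idx]
    return idx, np.sign(vals), np.abs(vals).mean(), d, keep_frac

def unsketch(idx, signs, scale, d, keep_frac):
    """Expand the sketch; the 1/keep_frac factor compensates for subsampling."""
    out = np.zeros(d)
    out[idx] = signs * scale / keep_frac
    return out

u = rng.normal(size=10_000)
rec = np.mean([unsketch(*sketch_update(u)) for _ in range(200)], axis=0)
print("relative error of averaged reconstructions:",
      np.linalg.norm(rec - u) / np.linalg.norm(u))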
Federated Learning with Quantization Constraints
TLDR: This work identifies the unique characteristics of conveying trained models over rate-constrained channels, characterizes a suitable quantization scheme for such setups, and shows that combining universal vector quantization methods with FL yields a decentralized training system that is both efficient and feasible.
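As a hedged illustration of rate-constrained uplinks, the sketch below uses subtractive dithered (universal) scalar quantization of a model update: the client adds a shared random dither, rounds to a uniform grid, and the server subtracts the same dither. The grid step and shared seed are assumptions; the paper's scheme is vector-valued and more refined.

# Toy universal (dithered) quantization of a model update.
import numpy as np

def encode(u, step, seed):
    dither = np.random.default_rng(seed).uniform(-step / 2, step / 2, size=u.shape)
    return np.round((u + dither) / step).astype(np.int32)   # integers sent uplink

def decode(q, step, seed):
    dither = np.random.default_rng(seed).uniform(-step / 2, step / 2, size=q.shape)
    return q * step - dither

rng = np.random.default_rng(4)
update = rng.normal(size=1000)
q = encode(update, step=0.05, seed=123)
rec = decode(q, step=0.05, seed=123)
print("max error:", np.abs(rec - update).max())   # bounded by step / 2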
Neural Architecture Search over Decentralized Data
TLDR: This work presents FedNAS, a highly optimized framework for efficient federated NAS that fully exploits the key opportunity of insufficient re-training of model candidates during the architecture search, and incorporates three key optimizations: parallel training of candidates on subsets of clients, early dropping of candidates with inferior performance, and dynamic round numbers.
Efficient Decentralized Deep Learning by Dynamic Model Averaging
TLDR: An extensive empirical evaluation validates a major improvement in the trade-off between model performance and communication, which could benefit numerous decentralized learning applications such as autonomous driving, or voice recognition and image classification on mobile phones.
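One way to read "dynamic model averaging" is a divergence-triggered rule: workers train locally and only synchronize when their drift from the last averaged model exceeds a threshold. The sketch below uses that assumed rule with a toy local step; it is not the paper's exact protocol.

# Hypothetical divergence-triggered averaging.
import numpy as np

rng = np.random.default_rng(5)
n_workers, d = 5, 50
w_sync = np.zeros(d)                        # last globally averaged model
workers = [w_sync.copy() for _ in range(n_workers)]
threshold, rounds, comms = 1.0, 100, 0

for t in range(rounds):
    for i in range(n_workers):              # pretend local (noisy) training step
        workers[i] -= 0.05 * (workers[i] - rng.normal(size=d))
    drift = np.mean([np.linalg.norm(w - w_sync) for w in workers])
    if drift > threshold:                   # communicate only when necessary
        w_sync = np.mean(workers, axis=0)
        workers = [w_sync.copy() for _ in range(n_workers)]
        comms += 1

print(f"synchronized {comms} times in {rounds} rounds")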
Generating private data with user customization
TLDR: This work first decouples the creation of a latent representation from the privatization of the data, which allows user-specific privatization to occur in a setting with limited computation and minimal disturbance to the utility of the data.

References

Showing 1-10 of 63 references
Federated Learning: Strategies for Improving Communication Efficiency
TLDR: Two ways to reduce uplink communication costs are proposed: structured updates, where the user directly learns an update from a restricted space parametrized using a smaller number of variables (e.g., low-rank or a random mask); and sketched updates, where the user learns a full model update and then compresses it using a combination of quantization, random rotations, and subsampling.
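Complementing the sketched-update example given earlier, here is a hedged sketch of the structured-update idea in its low-rank form: only a small matrix A travels uplink, with the update expanded server-side as A @ B for a fixed random B regenerated from a shared seed. Shapes, rank, and the least-squares fit are illustrative; in the actual method the client would learn A directly during training.

# Low-rank structured update: send A (and a seed), reconstruct A @ B.
import numpy as np

d_out, d_in, rank, seed = 64, 256, 4, 7
rng = np.random.default_rng(0)
full_update = rng.normal(size=(d_out, d_in))             # stand-in for a true update

B = np.random.default_rng(seed).normal(size=(rank, d_in))   # shared, never sent
A, *_ = np.linalg.lstsq(B.T, full_update.T, rcond=None)     # best A for A @ B
A = A.T
reconstructed = A @ B                                       # server-side expansion

print(f"uplink floats: {A.size + 1} vs {full_update.size}",
      "| relative error:",
      np.linalg.norm(reconstructed - full_update) / np.linalg.norm(full_update))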
Large Scale Distributed Deep Networks
TLDR: This paper considers the problem of training a deep network with billions of parameters using tens of thousands of CPU cores, and develops two algorithms for large-scale distributed training, Downpour SGD and Sandblaster L-BFGS, which increase the scale and speed of deep network training.
Deep Learning with Differential Privacy
TLDR: This work develops new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrates that deep neural networks can be trained with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
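A hedged sketch of the core differentially private training step: clip each per-example gradient to a maximum L2 norm, sum, add Gaussian noise scaled to that norm, and average. Privacy accounting (the "privacy budget") is omitted; the clip norm, noise multiplier, and least-squares model are illustrative values.

# Simplified DP-SGD step on a toy least-squares problem.
import numpy as np

rng = np.random.default_rng(8)

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, noise_mult=1.1):
    n = len(y)
    per_example = (X @ w - y)[:, None] * X           # per-example gradients, (n, d)
    norms = np.linalg.norm(per_example, axis=1, keepdims=True)
    clipped = per_example / np.maximum(1.0, norms / clip)
    noisy_sum = clipped.sum(axis=0) + rng.normal(scale=noise_mult * clip,
                                                 size=w.shape)
    return w - lr * noisy_sum / n

d, n = 10, 256
true_w = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ true_w
w = np.zeros(d)
for _ in range(300):
    w = dp_sgd_step(w, X, y)
print("distance to true weights:", np.linalg.norm(w - true_w))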
Distributed Learning, Communication Complexity and Privacy
TLDR: General upper and lower bounds on the amount of communication needed to learn well are provided, showing that in addition to VC-dimension and covering number, quantities such as the teaching dimension and mistake bound of a class play an important role.
Privacy-preserving deep learning
TLDR: The unprecedented accuracy of deep learning methods has turned them into the foundation of new AI-based services on the Internet, and commercial companies that collect user data on a large scale have been the main beneficiaries.
Learning Multiple Layers of Features from Tiny Images
TLDR: It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
Revisiting Distributed Synchronous SGD
TLDR: It is demonstrated that a third approach, synchronous optimization with backup workers, can avoid asynchronous noise while mitigating the effect of the worst stragglers; the approach is empirically validated and shown to converge faster and to better test accuracies.
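An illustrative sketch of the backup-worker idea: each step launches N gradient computations but aggregates only the first N - b to arrive, dropping the slowest b as presumed stragglers. Arrival times and the gradient oracle below are simulated assumptions, not the paper's system.

# Simulated synchronous SGD with b backup workers.
import numpy as np

rng = np.random.default_rng(9)
N, b, d = 10, 2, 20
w = np.zeros(d)
target = rng.normal(size=d)

for step in range(100):
    grads = [(w - target) + 0.1 * rng.normal(size=d) for _ in range(N)]
    arrival = rng.exponential(size=N)           # simulated per-worker latency
    fastest = np.argsort(arrival)[: N - b]      # ignore the b slowest workers
    g = np.mean([grads[i] for i in fastest], axis=0)
    w -= 0.1 * g

print("distance to optimum:", np.linalg.norm(w - target))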
Deep learning with Elastic Averaging SGD
TLDR: Experiments demonstrate that the new algorithm accelerates the training of deep architectures compared to DOWNPOUR and other common baseline approaches, and is furthermore very communication-efficient.
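A hedged sketch of the elastic-averaging idea: each worker takes a local gradient step plus an "elastic" pull toward a shared center variable, and the center in turn moves toward the workers. The quadratic toy objective and coefficients are assumptions for illustration.

# Toy elastic averaging between workers and a center variable.
import numpy as np

rng = np.random.default_rng(10)
n_workers, d = 4, 30
target = rng.normal(size=d)
workers = [rng.normal(size=d) for _ in range(n_workers)]
center = np.zeros(d)
lr, rho = 0.05, 0.1            # learning rate and elastic coefficient

for t in range(500):
    for i in range(n_workers):
        grad = workers[i] - target + 0.1 * rng.normal(size=d)    # noisy gradient
        workers[i] -= lr * grad + rho * (workers[i] - center)    # elastic pull
    center += rho * sum(w - center for w in workers) / n_workers

print("center error:", np.linalg.norm(center - target))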
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
TLDR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
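For reference, a minimal sketch of the batch-normalization transform itself: normalize each feature by its mini-batch mean and variance, then apply a learned scale (gamma) and shift (beta). Only the training-time statistics are shown; running averages for inference are omitted.

# Training-time batch normalization over a mini-batch of activations.
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)                  # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(11)
x = 3.0 + 2.0 * rng.normal(size=(128, 16))        # shifted, scaled activations
y = batch_norm(x, gamma=np.ones(16), beta=np.zeros(16))
print("per-feature mean ~0:", np.abs(y.mean(axis=0)).max(),
      "| per-feature std ~1:", y.std(axis=0).mean())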
Practical Secure Aggregation for Federated Learning on User-Held Data
TLDR: This work considers training a deep neural network in the Federated Learning model, using distributed stochastic gradient descent across user-held training data on mobile devices, wherein Secure Aggregation protects each user's model gradient.
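A toy sketch of the pairwise-masking idea behind secure aggregation: every pair of users derives a shared random mask; one adds it and the other subtracts it, so each upload looks random while the masks cancel in the server's sum. The seed-based mask derivation here is a stand-in, and dropout handling and key agreement, which the full protocol addresses, are omitted.

# Pairwise masks that cancel in the aggregate.
import numpy as np

n_users, d = 4, 8
rng = np.random.default_rng(12)
updates = [rng.normal(size=d) for _ in range(n_users)]

def pair_mask(i, j, d):
    """Mask shared by users i < j, derived from a (toy) shared seed."""
    return np.random.default_rng(1000 * i + j).normal(size=d)

masked = []
for i in range(n_users):
    m = updates[i].copy()
    for j in range(n_users):
        if i < j:
            m += pair_mask(i, j, d)      # user i adds the pair's mask
        elif j < i:
            m -= pair_mask(j, i, d)      # the other user subtracts it
    masked.append(m)

server_sum = np.sum(masked, axis=0)      # masks cancel pairwise
print("matches true sum:", np.allclose(server_sum, np.sum(updates, axis=0)))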