Corpus ID: 220845850

Group Knowledge Transfer: Collaborative Training of Large CNNs on the Edge

Chaoyang He, M. Annavaram, A. Avestimehr
Scaling up the convolutional neural network (CNN) size (e.g., width, depth, etc.) is known to effectively improve model accuracy. However, the large model size impedes training on resource-constrained edge devices. For instance, federated learning (FL) on edge devices cannot tackle large CNN training demands, even though there is a strong practical need for FL due to its privacy and confidentiality properties. To address the resource-constrained reality, we reformulate FL as a group knowledge…
Distributed Distillation for On-Device Learning
Introduces a distributed distillation algorithm in which devices communicate and learn from soft-decision (softmax) outputs, which are inherently architecture-agnostic and scale only with the number of classes.
Asynchronous Edge Learning using Cloned Knowledge Distillation
Proposes an asynchronous, model-based communication method with knowledge distillation that can elastically handle the dynamic inflow and outflow of users in a service with minimal communication cost, operate with essentially no bottleneck due to user delay, and protect users' privacy.
Cross-Node Federated Graph Neural Network for Spatio-Temporal Data Modeling
Presents a federated spatio-temporal model, the Cross-Node Federated Graph Neural Network (CNFGNN), which explicitly encodes the underlying graph structure using a graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that data in a network of nodes be generated locally on each node and remain decentralized.
Federated learning using a mixture of experts
This paper proposes a federated learning framework that uses a mixture of experts to balance the specialist nature of a locally trained model with the generalist knowledge of a global model, and shows that the mixture-of-experts model is better suited as a personalized model for devices when data is heterogeneous, outperforming both global and local models.
TornadoAggregate: Accurate and Scalable Federated Learning via the Ring-Based Architecture
This work proposes a novel algorithm called TornadoAggregate that improves both accuracy and scalability by facilitating the ring architecture and establishes three principles to reduce variance: Ring-Aware Grouping, Small Ring, and Ring Chaining.
FedHome: Cloud-Edge based Personalized Federated Learning for In-Home Health Monitoring
Proposes FedHome, a novel cloud-edge based federated learning framework for in-home health monitoring, which learns a shared global model in the cloud from multiple homes at the network edges and protects data privacy by keeping user data local.
A Latent Factor (PLS-SEM) Approach: Assessing the Determinants of Effective Knowledge Transfer
The PLS-SEM results show positive and significant relationships of social interaction and training with knowledge transfer, while ICT shows a positive but insignificant relationship with knowledge transfer.
Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning
This article proposes the first secure aggregation framework, named Turbo-Aggregate, which employs a multi-group circular strategy for efficient model aggregation, and leverages additive secret sharing and novel coding techniques for injecting aggregation redundancy in order to handle user dropouts while guaranteeing user privacy.
Ensemble Distillation for Robust Model Fusion in Federated Learning
This work proposes ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data using the outputs of the client models, which allows flexible aggregation over heterogeneous client models that can differ, e.g., in size, numerical precision, or structure.
FedML: A Research Library and Benchmark for Federated Machine Learning
Introduces FedML, an open research library and benchmark that facilitates the development of new federated learning algorithms and fair performance comparisons, and that can provide an efficient and reproducible means of developing and evaluating algorithms for the federated learning research community.


Knowledge Distillation by On-the-Fly Native Ensemble
This work presents an On-the-fly Native Ensemble strategy for one-stage online distillation that improves the generalisation performance of a variety of deep neural networks more significantly than alternative methods on four image classification datasets.
Collaborative Learning for Deep Neural Networks
The empirical results on CIFAR and ImageNet datasets demonstrate that deep neural networks learned as a group in a collaborative way significantly reduce the generalization error and increase the robustness to label noise.
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient, and its effectiveness is demonstrated on scaling up MobileNets and ResNet.
Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data
Proposes federated distillation (FD), a distributed model training algorithm whose communication payload size is much smaller than that of a benchmark scheme, federated learning (FL), particularly when the model size is large.
Communication-Efficient Learning of Deep Networks from Decentralized Data
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
FitNets: Hints for Thin Deep Nets
This paper extends the idea of a student network that could imitate the soft output of a larger teacher network or ensemble of networks, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student.
Federated Learning with Matched Averaging
This work proposes the federated matched averaging (FedMA) algorithm, designed for federated learning of modern neural network architectures, e.g., convolutional neural networks (CNNs) and LSTMs, and indicates that FedMA outperforms popular state-of-the-art federated learning algorithms on deep CNN and LSTM architectures trained on real-world datasets, while improving communication efficiency.
Large scale distributed neural network training through online distillation
This paper claims that online distillation is a cost-effective way to make the exact predictions of a model dramatically more reproducible and can still speed up training even after the authors have already reached the point at which additional parallelism provides no benefit for synchronous or asynchronous stochastic gradient descent.
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
This paper finds that 99.9% of the gradient exchange in distributed SGD is redundant, and proposes Deep Gradient Compression (DGC) to greatly reduce the communication bandwidth, which enables large-scale distributed training on inexpensive commodity 1 Gbps Ethernet and facilitates distributed training on mobile.
signSGD: compressed optimisation for non-convex problems
signSGD can get the best of both worlds: compressed gradients and an SGD-level convergence rate, and the momentum counterpart of signSGD is able to match the accuracy and convergence speed of Adam on deep ImageNet models.
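
As a rough illustration of the sign-based update named in the last entry, the sketch below applies a sign-only step and a momentum variant to a toy quadratic. It is a minimal pure-Python sketch, not the cited paper's implementation: the function names (`sign`, `signsgd_step`, `signum_step`), the learning rate, the `beta` value, and the toy objective are all assumptions made for illustration.

```python
def sign(v: float) -> int:
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def signsgd_step(params, grads, lr):
    """One sign-based update: each coordinate moves by lr against its gradient sign.

    Only the sign of each gradient coordinate is used, so a worker would need
    to transmit a single sign per coordinate rather than a full float.
    """
    return [p - lr * sign(g) for p, g in zip(params, grads)]


def signum_step(params, grads, momentum, lr, beta=0.9):
    """Momentum variant (hypothetical helper): step along the sign of an
    exponential moving average of past gradients."""
    new_momentum = [beta * m + (1 - beta) * g for m, g in zip(momentum, grads)]
    new_params = [p - lr * sign(m) for p, m in zip(params, new_momentum)]
    return new_params, new_momentum


# Toy objective f(x) = sum(x_i ** 2), whose gradient is 2 * x.
x = [1.0, -2.0, 0.5]
for _ in range(3):
    g = [2.0 * xi for xi in x]
    x = signsgd_step(x, g, lr=0.25)
```

Because every coordinate moves by the same fixed amount regardless of gradient magnitude, the step size `lr` alone controls how far each parameter travels per iteration in this sketch.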