Collaborative Group Learning
@inproceedings{Feng2020CollaborativeGL,
  title     = {Collaborative Group Learning},
  author    = {Shaoxiong Feng and Hongshen Chen and Xuancheng Ren and Zhuoye Ding and Kan Li and Xu Sun},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2020}
}
Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima. However, previous approaches typically struggle with drastically aggravated student homogenization when the number of students rises. In this paper, we propose Collaborative Group Learning, an efficient framework that aims to diversify the feature representation and conduct an effective regularization. Intuitively, similar to the human group study mechanism…
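As a rough illustration of the collaborative-learning setup the abstract refers to (a pool of peer student networks trained jointly, each regularized toward its peers' predictions), the sketch below implements a generic mutual-learning training step in PyTorch. This is not the paper's group-based mechanism; the student architecture, the peer_weight coefficient, and the data shapes are illustrative assumptions.

```python
# Minimal sketch of collaborative / mutual learning among a pool of small
# student networks (Deep Mutual Learning style): each student minimizes its
# task loss plus the average KL divergence toward its peers' predictions.
# Architecture, hyperparameters, and data below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def collaborative_step(students, optimizers, x, y, peer_weight=1.0):
    """One training step for a pool of peer students on a batch (x, y)."""
    logits = [net(x) for net in students]                    # forward every student
    probs = [F.softmax(l.detach(), dim=1) for l in logits]   # detached peers act as soft targets

    for i, (net, opt) in enumerate(zip(students, optimizers)):
        ce = F.cross_entropy(logits[i], y)                   # supervised task loss
        # Average KL divergence from student i's predictions to each peer's predictions.
        kl = sum(
            F.kl_div(F.log_softmax(logits[i], dim=1), probs[j], reduction="batchmean")
            for j in range(len(students)) if j != i
        ) / max(len(students) - 1, 1)
        loss = ce + peer_weight * kl
        opt.zero_grad()
        loss.backward()
        opt.step()

# Usage with toy students (hypothetical setup):
students = [nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10)) for _ in range(4)]
optimizers = [torch.optim.SGD(s.parameters(), lr=0.1) for s in students]
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
collaborative_step(students, optimizers, x, y)
```

Detaching the peers' probabilities keeps each backward pass confined to a single student's graph; this plain peer-averaging is exactly the kind of setup that, per the abstract, tends to homogenize students as the pool grows, which the paper's group-based framework is designed to counteract.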
3 Citations
Decentralized Federated Learning via Mutual Knowledge Transfer
- IEEE Internet of Things Journal, 2022
The proposed Def-KT algorithm significantly outperforms the baseline DFL methods with model averaging, i.e., Combo and FullAvg, especially when the training data are not independent and identically distributed (non-IID) across different clients.
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing
- ACL, 2022
This work proposes CAMERO, a consistency-regularized ensemble learning approach based on perturbed models that shares the weights of the bottom layers across all models and applies different perturbations to the hidden representations of different models, effectively promoting model diversity.
References (showing 1-10 of 33)
Online Knowledge Distillation with Diverse Peers
- AAAI, 2020
Experimental results show that the proposed framework consistently gives better performance than state-of-the-art approaches without sacrificing training or inference complexity, demonstrating the effectiveness of the proposed two-level distillation framework.
Deep Mutual Learning
- CVPR, 2018
Surprisingly, it is revealed that no prior powerful teacher network is necessary - mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher.
Collaborative Learning for Deep Neural Networks
- NeurIPS, 2018
The empirical results on CIFAR and ImageNet datasets demonstrate that deep neural networks learned as a group in a collaborative way significantly reduce the generalization error and increase the robustness to label noise.
FitNets: Hints for Thin Deep Nets
- ICLR, 2015
This paper extends the idea of a student network that could imitate the soft output of a larger teacher network or ensemble of networks, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student.
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
- ICML, 2020
Generative Teaching Networks may represent a first step toward the ambitious goal of algorithms that generate their own training data and, in doing so, open a variety of interesting new research questions and directions.
A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning
- CVPR, 2017
A novel technique for knowledge transfer in which knowledge from a pretrained deep neural network (DNN) is distilled and transferred to another DNN, showing that a student DNN trained on the distilled knowledge is optimized much faster than the original model and also outperforms it.
Variational Information Distillation for Knowledge Transfer
- CVPR, 2019
An information-theoretic framework for knowledge transfer is proposed which formulates knowledge transfer as maximizing the mutual information between the teacher and the student networks and which consistently outperforms existing methods.
Knowledge Distillation by On-the-Fly Native Ensemble
- NeurIPS, 2018
This work presents an On-the-fly Native Ensemble strategy for one-stage online distillation that improves the generalisation performance of a variety of deep neural networks more significantly than alternative methods on four image classification datasets.
Convergent Learning: Do different neural networks learn the same representations?
- FE@NIPS, 2015
This paper investigates the extent to which neural networks exhibit convergent learning, i.e., whether the representations learned by multiple nets converge to a set of features that are either individually similar between networks or whose subsets span similar low-dimensional spaces.
Random Path Selection for Continual Learning
- NeurIPS, 2019
This paper proposes a random path selection algorithm, called RPS-Net, that progressively chooses optimal paths for new tasks while encouraging parameter sharing and reuse, and introduces a simple controller to dynamically balance model plasticity.