Corpus ID: 195346954

ScreenerNet: Learning Curriculum for Neural Networks

@article{Kim2018ScreenerNetLC,
  title={ScreenerNet: Learning Curriculum for Neural Networks},
  author={Tae-Hoon Kim and Jonghyun Choi},
  journal={ArXiv},
  year={2018},
  volume={abs/1801.00904}
}
We propose to learn a curriculum, or syllabus, for supervised learning with deep neural networks. Specifically, we learn a weight for each training sample via an auxiliary neural network, called ScreenerNet, attached to the original network, and jointly train the two in an end-to-end fashion. We show that networks augmented with our ScreenerNet converge earlier and reach better accuracy than state-of-the-art rule-based curriculum learning methods in extensive experiments using three popular vision…
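The abstract describes joint, end-to-end training of a main network and an attached per-sample weighting network. Below is a minimal PyTorch sketch of that idea; the layer sizes, the margin M, and the exact form of the ScreenerNet objective are illustrative assumptions, not the paper's verbatim formulation.

```python
# Sketch: joint training of a main network and a per-sample weight network,
# in the spirit of the ScreenerNet abstract. Sizes, M, and the margin-based
# screener objective below are illustrative assumptions.
import torch
import torch.nn as nn

main_net = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
screener = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
opt = torch.optim.Adam(list(main_net.parameters()) + list(screener.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss(reduction="none")
M = 1.0  # margin hyperparameter (assumption)

def train_step(x, y):
    opt.zero_grad()
    per_sample_loss = ce(main_net(x), y)    # l_x for each sample
    w = screener(x).squeeze(1)              # w_x in [0, 1]
    # Main network: minimize the weighted loss (weights treated as constants).
    main_loss = (w.detach() * per_sample_loss).mean()
    # Screener: push w_x up for hard samples (large l_x) and down for easy
    # ones, via a margin objective (illustrative form).
    l = per_sample_loss.detach()
    scr_loss = ((1 - w) ** 2 * l + w ** 2 * torch.clamp(M - l, min=0)).mean()
    (main_loss + scr_loss).backward()
    opt.step()
```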
An Empirical Study of Example Forgetting during Deep Neural Network Learning
TLDR
It is found that certain examples are forgotten with high frequency, and some not at all; a data set’s (un)forgettable examples generalize across neural architectures; and a significant fraction of examples can be omitted from the training data set while still maintaining state-of-the-art generalization performance.
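The summary hinges on counting "forgetting events": flips from correct to incorrect classification of the same example between consecutive evaluations. A minimal sketch of that bookkeeping, with illustrative names:

```python
# Sketch: count forgetting events per training example. An example is
# "forgotten" when it flips from correctly to incorrectly classified
# between consecutive evaluations; examples never forgotten (and always
# correct) are the "unforgettable" ones the summary refers to.
import numpy as np

def update_forgetting(prev_correct, now_correct, forget_counts):
    """prev_correct/now_correct: bool arrays over the training set."""
    forgotten = prev_correct & ~now_correct
    forget_counts[forgotten] += 1
    return now_correct.copy(), forget_counts

# Usage inside a training loop, once per epoch:
n = 50_000
prev = np.zeros(n, dtype=bool)
counts = np.zeros(n, dtype=np.int64)
# prev, counts = update_forgetting(prev, correct_mask_this_epoch, counts)
```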
Online Adaptative Curriculum Learning for GANs
TLDR
Experimental results show that the proposed framework, which trains the generator against an ensemble of discriminator networks, improves sample quality and diversity over existing baselines by effectively learning a curriculum, and they support the claim that weaker discriminators have higher entropy, improving mode coverage.
Increasing Robustness to Spurious Correlations using Forgettable Examples
TLDR
A new approach to robustify models by fine-tuning twice, first on the full training data and then on the minority examples only; applying this approach to the MNLI, QQP, and FEVER datasets yields substantial improvements in out-of-distribution generalization.
A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning
TLDR
A novel versatile adaptive curriculum learning (VACL) framework is presented, a substantial step toward applying automatic curriculum learning to dialogue policy tasks: it evaluates the difficulty of dialogue tasks using only the learning experience of the dialogue policy, and supports skip-level selection according to the policy's learning needs to maximize learning efficiency.
Towards Realistic Predictors
TLDR
Experimental results provide evidence in support of the effectiveness of the proposed architecture and the learned hardness predictor, and show that the realistic classifier always improves performance on the examples that it accepts to classify, performing better on these examples than an equivalent nonrealistic classifier.
Curriculum learning techniques applied to the early diagnosis of skin cancer
Detecting skin cancer early is crucial: the survival rate for early diagnosis is very high, around 95%, but drops substantially, to 10% to 15%, if the cancer reaches…
AutoAssist: A Framework to Accelerate Training of Deep Neural Networks
TLDR
It is demonstrated that AutoAssist reduces the number of epochs needed to train a ResNet to the same test accuracy on an image classification dataset by 40%, and cuts the training time needed for a transformer model to reach the same BLEU score on a translation dataset by 30%.

References

SHOWING 1-10 OF 25 REFERENCES
Curriculum learning
TLDR
It is hypothesized that curriculum learning affects both the speed at which training converges to a minimum and the quality of the local minima obtained: curriculum learning can be seen as a particular form of continuation method (a general strategy for the global optimization of non-convex functions).
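As a concrete illustration of the easy-to-hard idea in this summary, here is a minimal sketch of a curriculum pacing schedule; the difficulty scores and the linear pacing function are assumptions, since the paper explores several task-specific curricula.

```python
# Sketch: expose the model to the easiest fraction of examples first and
# gradually grow that fraction to cover the full dataset.
import numpy as np

def curriculum_indices(difficulty, step, total_steps, min_frac=0.2):
    """Return indices of the easiest fraction of examples at this step."""
    frac = min(1.0, min_frac + (1.0 - min_frac) * step / total_steps)
    order = np.argsort(difficulty)           # easy (low score) first
    k = max(1, int(frac * len(difficulty)))
    return order[:k]
```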
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
TLDR
This work proposes a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead.
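The strategy in this summary is concrete enough to sketch: at each decoding step, feed the gold previous token with probability eps and the model's own prediction otherwise, annealing eps toward zero over training. The function names below are illustrative, and linear decay is just one of the schedules the paper considers.

```python
# Sketch: scheduled sampling for a sequence decoder. Mix ground-truth and
# model-generated previous tokens per sample, with teacher-forcing
# probability eps annealed over training.
import torch

def decode_step_input(gold_prev, model_prev, eps):
    """gold_prev/model_prev: (batch,) token-id tensors."""
    use_gold = torch.rand(gold_prev.shape[0]) < eps
    return torch.where(use_gold, gold_prev, model_prev)

def linear_decay(step, total_steps, eps_min=0.0):
    # Linear decay of the teacher-forcing probability (one schedule choice).
    return max(eps_min, 1.0 - step / total_steps)
```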
Online Batch Selection for Faster Training of Neural Networks
TLDR
This work investigates online batch selection strategies for two state-of-the-art methods of stochastic gradient-based optimization, AdaDelta and Adam, and proposes a simple strategy where all datapoints are ranked w.r.t. their latest known loss value and the probability to be selected decays exponentially as a function of rank.
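The selection rule in this summary can be sketched directly: rank examples by their latest known loss and sample them with probability that decays exponentially in rank. The selection-pressure parameter s below stands in for the paper's hyperparameter, and its value here is an assumption.

```python
# Sketch: rank-based online batch selection. Examples with the largest
# latest-known loss get rank 0 and the highest selection probability; the
# probability decays exponentially so the top-ranked example is s times
# more likely to be picked than the bottom-ranked one.
import numpy as np

def sample_batch(latest_losses, batch_size, s=100.0):
    n = len(latest_losses)
    ranks = np.empty(n, dtype=np.int64)
    ranks[np.argsort(-latest_losses)] = np.arange(n)  # rank 0 = largest loss
    p = np.exp(-ranks * np.log(s) / n)                # exponential decay in rank
    p /= p.sum()
    return np.random.choice(n, size=batch_size, replace=False, p=p)
```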
Hindsight Experience Replay
TLDR
A novel technique is presented that allows sample-efficient learning from rewards that are sparse and binary, thereby avoiding the need for complicated reward engineering; it may be seen as a form of implicit curriculum.
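The "implicit curriculum" here comes from relabeling goals with states the agent actually reached, so sparse binary rewards become informative. A minimal sketch under assumed transition and reward interfaces:

```python
# Sketch: hindsight relabeling of an episode. Each transition is stored
# once with its original goal and k more times with goals taken from
# states actually achieved later in the episode ("future" strategy).
# The (state, action, next_state, goal) tuple layout and reward_fn are
# illustrative assumptions.
import random

def her_relabel(episode, reward_fn, k=4):
    """episode: list of (state, action, next_state, goal) tuples."""
    replay = []
    for t, (s, a, s2, g) in enumerate(episode):
        replay.append((s, a, s2, g, reward_fn(s2, g)))
        future = episode[t:]
        for _ in range(min(k, len(future))):
            g2 = random.choice(future)[2]   # an achieved state used as goal
            replay.append((s, a, s2, g2, reward_fn(s2, g2)))
    return replay
```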
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
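The design this summary describes, deep stacks of very small (3x3) convolutions separated by max-pools, can be sketched compactly; the stage layout below follows the 16-layer configuration, with PyTorch used for brevity.

```python
# Sketch: VGG-style feature extractor. Each stage stacks 3x3 convolutions
# (padding 1 preserves resolution) and ends with a 2x2 max-pool; channel
# width doubles per stage up to 512, as in the 16-layer configuration.
import torch.nn as nn

def vgg_stage(in_ch, out_ch, n_convs):
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

features = nn.Sequential(
    vgg_stage(3, 64, 2), vgg_stage(64, 128, 2),
    vgg_stage(128, 256, 3), vgg_stage(256, 512, 3), vgg_stage(512, 512, 3),
)
```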
Understanding Black-box Predictions via Influence Functions
TLDR
This paper uses influence functions — a classic technique from robust statistics — to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction.
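For a model small enough to form the Hessian explicitly, the influence of a training point z on the loss at a test point z_test can be computed directly as -grad L(z_test)^T H^{-1} grad L(z). A minimal sketch follows; the paper itself relies on Hessian-vector-product approximations at scale, and the damping term here is an assumption for numerical stability.

```python
# Sketch: exact influence of a training point on a test point's loss for a
# small model where the Hessian fits in memory. A positive value means
# upweighting the training point would increase the test loss.
import numpy as np

def influence(grad_test, grad_train, hessian, damping=1e-3):
    h = hessian + damping * np.eye(hessian.shape[0])  # regularize H
    return -grad_test @ np.linalg.solve(h, grad_train)
```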
ADADELTA: An Adaptive Learning Rate Method
We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first-order information and has minimal computational overhead beyond vanilla stochastic gradient descent.
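The per-dimension rule is compact enough to sketch: keep decaying averages of squared gradients and squared updates, and scale each step by the ratio of their RMS values, so no global learning rate is needed. A minimal NumPy sketch of the update:

```python
# Sketch of the ADADELTA update: E[g^2] and E[dx^2] are exponential moving
# averages; the step is -RMS(dx)/RMS(g) * g, computed per dimension.
import numpy as np

def adadelta_step(x, grad, eg2, edx2, rho=0.95, eps=1e-6):
    eg2 = rho * eg2 + (1 - rho) * grad**2              # accumulate E[g^2]
    dx = -np.sqrt(edx2 + eps) / np.sqrt(eg2 + eps) * grad
    edx2 = rho * edx2 + (1 - rho) * dx**2              # accumulate E[dx^2]
    return x + dx, eg2, edx2
```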
Self-Paced Learning for Latent Variable Models
TLDR
A novel, iterative self-paced learning algorithm in which each iteration simultaneously selects easy samples and learns a new parameter vector; it outperforms the state-of-the-art method for learning a latent structural SVM on four applications.
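The alternation in this summary, select the currently easy samples, refit, then admit harder ones, can be sketched as follows; the threshold-growth factor is an illustrative assumption.

```python
# Sketch: one round of self-paced selection. Samples whose current loss is
# below the threshold are marked easy (v=1); the threshold then grows so
# harder samples enter in later rounds.
import numpy as np

def self_paced_round(losses, lam, growth=1.3):
    """Return the binary selection v and the relaxed threshold."""
    v = (losses < lam).astype(np.float64)   # easy samples only
    return v, lam * growth                  # growth factor is an assumption

# Outer loop (informal): compute per-sample losses under current params,
# select v, refit parameters on samples with v == 1, repeat until v is all ones.
```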
Prioritized Experience Replay
TLDR
A framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently, in Deep Q-Networks, a reinforcement learning algorithm that achieved human-level performance across many Atari games.
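The proportional variant of this scheme is easy to sketch: sample transitions with probability proportional to (|TD error| + eps)^alpha and correct the induced bias with importance-sampling weights controlled by beta. The hyperparameter values below are commonly quoted defaults and should be treated as assumptions here; in practice a sum-tree makes the sampling efficient.

```python
# Sketch: proportional prioritized sampling from a replay buffer, with
# importance-sampling weights to correct the non-uniform sampling bias.
import numpy as np

def sample_prioritized(td_errors, batch_size, alpha=0.6, beta=0.4, eps=1e-2):
    p = (np.abs(td_errors) + eps) ** alpha
    probs = p / p.sum()
    idx = np.random.choice(len(p), size=batch_size, p=probs)
    w = (len(p) * probs[idx]) ** (-beta)   # importance-sampling weights
    return idx, w / w.max()                # normalize weights for stability
```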