Curriculum learning

@inproceedings{Bengio2009CurriculumL,
  title={Curriculum learning},
  author={Yoshua Bengio and J. Louradour and Ronan Collobert and J. Weston},
  booktitle={ICML '09},
  year={2009}
}
Humans and animals learn much better when the examples are not randomly presented but organized in a meaningful order which illustrates gradually more concepts, and gradually more complex ones. Here, we formalize such training strategies in the context of machine learning, and call them "curriculum learning". In the context of recent research studying the difficulty of training in the presence of non-convex training criteria (for deep deterministic and stochastic neural networks), we explore …
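To make the easy-to-hard schedule described above concrete, here is a minimal sketch in which the training set is sorted by a difficulty score and exposed to the learner in growing stages. The difficulty scores, the stage counts, and the scikit-learn SGDClassifier learner are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def train_with_curriculum(X, y, difficulty, n_stages=5, epochs_per_stage=2, seed=0):
    """Easy-to-hard training sketch: each stage exposes a larger prefix of the
    data, sorted by an externally supplied difficulty score (low = easy)."""
    order = np.argsort(difficulty)                # easiest examples first
    X_sorted, y_sorted = X[order], y[order]
    classes = np.unique(y)
    model = SGDClassifier(random_state=seed)
    n = len(X_sorted)
    for stage in range(1, n_stages + 1):
        # Stage k trains on the easiest k/n_stages fraction of the data.
        cutoff = max(1, (n * stage) // n_stages)
        for _ in range(epochs_per_stage):
            model.partial_fit(X_sorted[:cutoff], y_sorted[:cutoff], classes=classes)
    return model
```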
Curriculum Learning by Optimizing Learning Dynamics
TLDR
Dynamics-optimized curriculum learning is introduced, which selects the training set at each step by a weighted sampling based on the scores of samples' residual and linear temporal dynamics, and significantly outperforms random mini-batch SGD and recent curriculum learning methods both in terms of efficiency and final performance.
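A rough sketch of the weighted mini-batch sampling idea mentioned in this summary, assuming a softmax weighting over per-sample scores; in the paper the scores combine each sample's residual and linear temporal dynamics, which are not reproduced here, and the temperature parameter is an illustrative placeholder.

```python
import numpy as np

def sample_minibatch(scores, batch_size, temperature=1.0, rng=None):
    """Draw a mini-batch by weighted sampling: samples with higher scores are
    selected more often, instead of the uniform sampling used by plain SGD."""
    rng = rng if rng is not None else np.random.default_rng()
    logits = np.asarray(scores, dtype=float) / temperature
    probs = np.exp(logits - logits.max())         # numerically stable softmax weights
    probs /= probs.sum()
    return rng.choice(len(probs), size=batch_size, replace=False, p=probs)
```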
Theory of Curriculum Learning, with Convex Loss Functions
TLDR
It is proved that when the ideal difficulty score is fixed, the convergence rate is monotonically increasing with respect to the loss of the current hypothesis at each point, and it is shown how these results reconcile two apparently contradictory heuristics: curriculum learning on the one hand, and hard data mining on the other.
A new strategy for curriculum learning using model distillation
In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. Humans and animals learn much better when gradually presented in a …
CS 229 Final Project: Automated Curriculum Learning
(Motivation) A long-standing challenge in deep learning is accelerating training time, i.e. the time required for neural networks to learn near-optimal layer weights on complex datasets. It is well …
Is Active Learning Always Beneficial? (Student Abstract)
This study highlights the limitations of automated curriculum learning, which may not be a viable strategy for tasks in which the benefits of the chosen curriculum are not apparent until much later.
Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks
TLDR
It is proved that the rate of convergence of an ideal curriculum learning method is monotonically increasing with the difficulty of the examples, and that this increase in convergence rate is monotonically decreasing as training proceeds.
Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft
TLDR
The results suggest that combining intra-episode and across-training exploration bonuses with learning progress creates a promising method for automated curriculum generation, which may substantially increase the ability to train more capable, generally intelligent agents.
Improved Reinforcement Learning with Curriculum
TLDR
By employing an end-game-first training curriculum to train an AlphaZero-inspired player, it is empirically shown that the rate of learning of an artificial player can be improved during the early stages of training when compared to a player not using a training curriculum.
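As a hedged illustration of the end-game-first idea, the schedule below starts training episodes near the end of a recorded game and moves the starting point toward the opening as training proceeds; the linear schedule and the function name are assumptions for illustration, not the paper's exact curriculum.

```python
def endgame_first_start_ply(game_length, stage, n_stages):
    """Return the ply from which a training episode starts at a given stage.
    Stage 1 starts close to the end of the game (short horizon to the reward);
    the final stage starts from the opening, so full games are played."""
    frac = stage / n_stages                       # fraction of the game exposed so far
    return max(0, int(round(game_length * (1.0 - frac))))
```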
An Analytical Theory of Curriculum Learning in Teacher-Student Networks
TLDR
This work analyses a prototypical neural network model of curriculum learning in the high-dimensional limit, employing statistical physics methods, and shows that by connecting different learning phases through simple Gaussian priors, curriculum can yield a large improvement in test performance.
Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning
TLDR
A two-stage ACL approach is proposed where a teacher algorithm first learns to train a DRL agent with a high-exploration curriculum, and then distills learned priors from the first run to generate an "expert curriculum" to re-train the same agent from scratch.

References

SHOWING 1-10 OF 49 REFERENCES
Flexible shaping: How learning in small steps helps
TLDR
This work studies the shaping of a hierarchical working memory task using an abstract neural network model as the target learner and uses the model to investigate some of the elements of successful shaping.
The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training
TLDR
The experiments confirm and clarify the advantage of unsupervised pre-training, and empirically show the influence of pre-training with respect to architecture depth, model capacity, and number of training examples.
Greedy Layer-Wise Training of Deep Networks
TLDR
These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
Learning Deep Architectures for AI
TLDR
The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
Learning and development in neural networks: the importance of starting small
  • J. Elman
  • Medicine, Psychology
  • Cognition
  • 1993
TLDR
Possible synergistic interactions between maturational change and the ability to learn a complex domain (language), as investigated in connectionist networks, suggest that developmental restrictions on resources may constitute a necessary prerequisite for mastering certain complex domains.
Explanation-based neural network learning: a lifelong learning approach
From the Publisher: Lifelong learning addresses situations in which a learner faces a series of different learning tasks providing the opportunity for synergy among them. Explanation-based neural …
Language acquisition in the absence of explicit negative evidence: how important is starting small?
It is commonly assumed that innate linguistic constraints are necessary to learn a natural language, based on the apparent lack of explicit negative evidence provided to children and on Gold's proof …
Active Learning with Statistical Models
TLDR
This work shows how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression.
Sparse Feature Learning for Deep Belief Networks
TLDR
This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.
A Fast Learning Algorithm for Deep Belief Nets
TLDR
A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.