# Curriculum learning

@inproceedings{Bengio2009CurriculumL, title={Curriculum learning}, author={Yoshua Bengio and J. Louradour and Ronan Collobert and J. Weston}, booktitle={ICML '09}, year={2009} }

Humans and animals learn much better when the examples are not randomly presented but organized in a meaningful order which illustrates gradually more concepts, and gradually more complex ones. Here, we formalize such training strategies in the context of machine learning, and call them "curriculum learning". In the context of recent research studying the difficulty of training in the presence of non-convex training criteria (for deep deterministic and stochastic neural networks), we explore…
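
The recipe itself is simple enough to sketch. Below is a minimal illustration of the idea, assuming a user-supplied `difficulty` score and an SGD-style `update` callback; the linear pacing schedule is an illustrative choice, not a protocol from the paper.

```python
# Minimal curriculum-learning sketch: sort examples from easy to hard
# and expand the training pool as training proceeds. `difficulty` and
# `update` are assumed to be supplied by the caller.
import random

def curriculum_train(samples, difficulty, update, epochs=10):
    ordered = sorted(samples, key=difficulty)  # easy -> hard
    for epoch in range(epochs):
        # Fraction of the sorted data exposed at this epoch (20% -> 100%).
        frac = min(1.0, 0.2 + 0.8 * epoch / max(1, epochs - 1))
        pool = ordered[: max(1, int(frac * len(ordered)))]
        random.shuffle(pool)  # random order *within* the current stage
        for x in pool:
            update(x)         # one optimization step on sample x

# Toy usage: treat a number's magnitude as its "difficulty".
curriculum_train(list(range(-50, 50)), difficulty=abs, update=lambda x: None)
```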

#### 2,652 Citations

Curriculum Learning by Optimizing Learning Dynamics

- Computer Science
- AISTATS
- 2021

Dynamics-optimized curriculum learning is introduced: the training set at each step is selected by weighted sampling based on scores of the samples' residual and linear temporal dynamics. It significantly outperforms random mini-batch SGD and recent curriculum learning methods in both efficiency and final performance.
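
As a rough illustration of the selection rule summarized above, the sketch below draws a mini-batch by weighted sampling over per-sample scores. Combining the current loss (residual) with the slope of a linear fit to each sample's recent loss trajectory is an assumption standing in for the paper's exact scoring terms.

```python
# Hedged sketch: weighted mini-batch sampling from per-sample
# "residual + temporal dynamics" scores (illustrative combination).
import numpy as np

def select_batch(loss_history, batch_size, rng=np.random.default_rng(0)):
    """loss_history: (n_samples, t) array of recent per-sample losses."""
    residual = loss_history[:, -1]                # current loss per sample
    t = np.arange(loss_history.shape[1])
    slope = np.polyfit(t, loss_history.T, 1)[0]   # linear loss trend per sample
    score = residual + np.abs(slope) + 1e-8       # one plausible combination
    return rng.choice(len(score), size=batch_size, replace=False,
                      p=score / score.sum())

losses = np.abs(np.random.default_rng(1).standard_normal((50, 5)))
batch = select_batch(losses, batch_size=8)        # indices of chosen samples
```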

Theory of Curriculum Learning, with Convex Loss Functions

- Computer Science, Mathematics
- ArXiv
- 2018

It is proved that when the ideal difficulty score is fixed, the convergence rate is monotonically increasing with respect to the loss of the current hypothesis at each point. These results reconcile two apparently contradictory heuristics: curriculum learning on the one hand, and hard data mining on the other.

A new strategy for curriculum learning using model distillation

- 2020

In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. Humans and animals learn much better when gradually presented in a…

CS 229 Final Project: Automated Curriculum Learning

- 2017

(Motivation) A long-standing challenge in deep learning is accelerating training time, i.e. the time required for neural networks to learn near-optimal layer weights on complex datasets. It is well…

Is Active Learning Always Beneficial? (Student Abstract)

- Computer Science
- AAAI
- 2021

This study highlights the limitations of automated curriculum learning, which may not be a viable strategy for tasks in which the benefits of the chosen curriculum are not apparent until much later…

Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks

- Computer Science, Mathematics
- ICML
- 2018

It is proved that the rate of convergence of an ideal curriculum learning method is monotonically increasing with the difficulty of the examples, and that this increase in convergence rate is monotonically decreasing as training proceeds.

Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft

- Computer Science, Mathematics
- ArXiv
- 2021

The results suggest that combining intra-episode and across-training exploration bonuses with learning progress creates a promising method for automated curriculum generation, which may substantially increase the ability to train more capable, generally intelligent agents.
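
One learning-progress heuristic that automated-curriculum methods of this kind build on can be sketched directly: tasks whose recent success rate is changing fastest get sampled most often. The window size and softmax temperature below are illustrative choices, not values from the paper.

```python
# Hedged sketch of learning-progress-based task selection.
import numpy as np

def pick_task(success_history, window=20, temp=0.1,
              rng=np.random.default_rng(0)):
    """success_history: dict mapping task -> list of 0/1 episode outcomes."""
    tasks, progress = list(success_history), []
    for task in tasks:
        h = success_history[task][-2 * window:]
        half = len(h) // 2
        # Learning progress = |change in success rate between window halves|;
        # unseen tasks get maximal progress so they are tried at least once.
        progress.append(abs(np.mean(h[half:]) - np.mean(h[:half])) if half else 1.0)
    p = np.exp(np.array(progress) / temp)
    return tasks[rng.choice(len(tasks), p=p / p.sum())]

history = {"bridge": [0, 0, 1, 1], "mine": [1] * 40}
print(pick_task(history))  # favors "bridge", whose success rate is moving
```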

Improved Reinforcement Learning with Curriculum

- Computer Science, Mathematics
- Expert Syst. Appl.
- 2020

By employing an end-game-first training curriculum to train an AlphaZero-inspired player, it is empirically shown that the rate of learning of an artificial player can be improved during the early stages of training when compared to a player not using a training curriculum.
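
The end-game-first idea can be sketched as a schedule over episode start positions: early in training, episodes begin near the end of a stored game so the reward signal is only a few moves away, and the start point recedes toward the opening as training progresses. The linear schedule here is an illustrative assumption, not the paper's exact one.

```python
# Hedged sketch of an end-game-first start-position schedule.
def start_index(game_length, step, total_steps):
    """Index into a recorded game trajectory at which to begin an episode."""
    frac = min(1.0, step / total_steps)   # 0.0 -> late game, 1.0 -> opening
    return int((1.0 - frac) * (game_length - 1))

# Early training starts near move 89 of a 100-move game; later from move 0.
print(start_index(100, step=1_000, total_steps=10_000))   # 89
print(start_index(100, step=10_000, total_steps=10_000))  # 0
```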

An Analytical Theory of Curriculum Learning in Teacher-Student Networks

- Computer Science, Physics
- ArXiv
- 2021

This work analyses a prototypical neural network model of curriculum learning in the high-dimensional limit, employing statistical physics methods and shows that by connecting different learning phases through simple Gaussian priors, curriculum can yield a large improvement in test performance.

Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning

- Computer Science, Mathematics
- ArXiv
- 2020

A two-stage ACL approach is proposed in which a teacher algorithm first learns to train a DRL agent with a high-exploration curriculum, and then distills learned priors from the first run to generate an "expert curriculum" to re-train the same agent from scratch.
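
A skeletal sketch of that two-stage scheme, with hypothetical stand-ins: `acl_teacher` is any automatic-curriculum teacher, `train` runs DRL training under a given task sampler, and the distilled "expert curriculum" is approximated here by replaying the first run's task-selection history, which is just one way the distillation step could be realized.

```python
# Hedged two-stage ACL sketch (all callables are caller-supplied stand-ins).
def train_again(acl_teacher, train, fresh_agent):
    # Stage 1: high-exploration run; record the tasks the teacher chose.
    agent = fresh_agent()
    history = train(agent, task_sampler=acl_teacher, record=True)
    # Stage 2: retrain from scratch under the replayed "expert" curriculum.
    agent = fresh_agent()
    train(agent,
          task_sampler=lambda step: history[min(step, len(history) - 1)])
    return agent
```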

#### References

Showing 1-10 of 49 references.

Flexible shaping: How learning in small steps helps

- Medicine, Psychology
- Cognition
- 2009

This work studies the shaping of a hierarchical working memory task using an abstract neural network model as the target learner and uses the model to investigate some of the elements of successful shaping.

The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training

- Computer Science
- AISTATS
- 2009

The experiments confirm and clarify the advantage of unsupervised pre-training, and empirically show the influence of pre-training with respect to architecture depth, model capacity, and number of training examples.

Greedy Layer-Wise Training of Deep Networks

- Computer Science
- NIPS
- 2006

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
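
The greedy layer-wise procedure is easy to sketch: each stage fits an unsupervised model to the codes produced by the stage below it, and the resulting weights initialize a deep network for fine-tuning. For brevity the sketch uses a tied-weight linear autoencoder trained by gradient descent; the paper's building blocks are RBMs, so this is an illustrative stand-in, not the original algorithm.

```python
# Hedged sketch of greedy layer-wise unsupervised pretraining.
import numpy as np

def pretrain_layer(X, hidden, lr=0.01, steps=300, rng=np.random.default_rng(0)):
    """One greedy stage: fit a tied-weight linear autoencoder to X,
    returning (weights, codes); the codes feed the next stage."""
    W = 0.01 * rng.standard_normal((X.shape[1], hidden))
    for _ in range(steps):
        R = X @ W @ W.T - X                                   # reconstruction residual
        W -= (lr / len(X)) * 2 * (X.T @ R @ W + R.T @ X @ W)  # gradient step
    return W, X @ W

def greedy_pretrain(X, layer_sizes):
    """Stack stages; each layer is trained on the previous layer's codes."""
    weights, H = [], X
    for h in layer_sizes:
        W, H = pretrain_layer(H, h)
        weights.append(W)
    return weights  # use these to initialize a deep net, then fine-tune

weights = greedy_pretrain(
    np.random.default_rng(1).standard_normal((128, 16)), [8, 4])
```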

Learning Deep Architectures for AI

- Computer Science
- Found. Trends Mach. Learn.
- 2007

The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.

Learning and development in neural networks: the importance of starting small

- Medicine, Psychology
- Cognition
- 1993

Possible synergistic interactions between maturational change and the ability to learn a complex domain (language) as investigated in connectionist networks suggest that developmental restrictions on resources may constitute a necessary prerequisite for mastering certain complex domains.
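
The "starting small" regime can be sketched as a staged data filter: the learner first sees only short sentences, and the admissible length grows over training. The length schedule below is illustrative; Elman also studied the dual regime of growing the network's memory rather than the input.

```python
# Hedged sketch of Elman-style "starting small" data staging.
def starting_small(sentences, length_schedule=(4, 8, None)):
    """Yield one training pool per stage, admitting longer sentences later;
    None means no length limit."""
    for max_len in length_schedule:
        yield [s for s in sentences
               if max_len is None or len(s.split()) <= max_len]

corpus = ["boys run", "the boy who chases the dogs runs fast"]
for stage, pool in enumerate(starting_small(corpus)):
    print(stage, pool)  # run one or more training epochs on each pool
```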

Explanation-based neural network learning a lifelong learning approach

- Computer Science
- 1995

Lifelong learning addresses situations in which a learner faces a series of different learning tasks providing the opportunity for synergy among them. Explanation-based neural…

Language acquisition in the absence of explicit negative evidence: how important is starting small?

- Medicine, Psychology
- Cognition
- 1999

It is commonly assumed that innate linguistic constraints are necessary to learn a natural language, based on the apparent lack of explicit negative evidence provided to children and on Gold's proof…

Active Learning with Statistical Models

- Computer Science
- NIPS
- 1994

This work shows how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression.
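
The underlying criterion (querying the input that most reduces the learner's variance) can be roughly illustrated as below; predictive variance is approximated here with bootstrap-resampled polynomial fits, a stand-in for the closed-form variance estimates the paper derives for mixtures of Gaussians and locally weighted regression.

```python
# Hedged sketch of variance-driven query selection for active learning.
import numpy as np

def next_query(X, y, candidates, degree=2, B=20, rng=np.random.default_rng(0)):
    """Return the candidate input where bootstrap fits disagree most."""
    preds = []
    for _ in range(B):
        i = rng.integers(0, len(X), len(X))       # bootstrap resample
        coef = np.polyfit(X[i], y[i], degree)     # refit the model
        preds.append(np.polyval(coef, candidates))
    return candidates[np.argmax(np.var(preds, axis=0))]

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, 10)
y = X**2 + 0.1 * rng.standard_normal(10)
print(next_query(X, y, candidates=np.linspace(-2, 2, 41)))
```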

Sparse Feature Learning for Deep Belief Networks

- Computer Science
- NIPS
- 2007

This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.

A Fast Learning Algorithm for Deep Belief Nets

- Computer Science, Medicine
- Neural Computation
- 2006

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.