Curriculum Learning: A Survey

Petru Soviany, Radu Tudor Ionescu, Paolo Rota, N. Sebe
International Journal of Computer Vision, pages 1526-1565
Training machine learning models in a meaningful order, from the easy samples to the hard ones, using curriculum learning can provide performance improvements over the standard training approach based on random data shuffling, without any additional computational costs. Curriculum learning strategies have been successfully employed in all areas of machine learning, in a wide range of tasks. However, the necessity of finding a way to rank the samples from easy to hard, as well as the right… 

CurML: A Curriculum Machine Learning Library

CurML is the first Curriculum Machine Learning library to integrate existing CL algorithms into a unified framework. It is convenient to use and flexible to customize through its five provided APIs.

How to Teach: Learning Data-Free Knowledge Distillation from Curriculum

Experiments conducted on benchmark datasets show that, with a simple course design strategy, CuDFKD achieves the best performance over state-of-the-art DFKD methods across different benchmarks, such as 95.28% top-1 accuracy of the ResNet18 model on CIFAR10, which is better than training from scratch with data.

On the Role of Corpus Ordering in Language Modeling

Empirical results from training transformer language models on an English corpus, evaluated both intrinsically and after fine-tuning on eight tasks from the GLUE benchmark, show consistent gains over conventional vanilla training.

Task Factorization in Curriculum Learning

This paper identifies different types of factorizations common in the literature of curriculum learning for reinforcement learning tasks: factorizations that involve the agent, the environment, or the mission, and presents several case studies to showcase how leveraging an appropriate factorization can boost learning using a simple curriculum.

Dynamic Data-Free Knowledge Distillation by Easy-to-Hard Learning Strategy

Experiments conducted on benchmark datasets show that with a simple course design strategy, CuDFKD achieves the best performance over state-of-the-art DFKD methods and different benchmarks, even better than training from scratch with data.

Curriculum Learning: A Regularization Method for Efficient and Stable Billion-Scale GPT Model Pre-Training

This work presents a novel sequence length warmup method that simultaneously improves training stability and efficiency, and exerts a gradient variance reduction effect and regularizes early stages of training where the amount of training data is much smaller than the model capacity.
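
A minimal sketch of what a sequence length warmup schedule might look like. The linear ramp, the rounding to a multiple of 8, and the names `warmup_seq_len` and `truncate_batch` are illustrative assumptions, not details taken from the paper.

```python
def warmup_seq_len(step, warmup_steps, start_len, full_len, multiple=8):
    """Return the sequence length to train with at a given step.

    Assumption: the length grows linearly from start_len to full_len
    over warmup_steps, then stays at full_len.
    """
    if step >= warmup_steps:
        return full_len
    frac = step / warmup_steps
    length = start_len + frac * (full_len - start_len)
    # Round down to a multiple (hypothetical hardware-friendly choice).
    length = int(length) // multiple * multiple
    # Never go below the starting length or above the full length.
    return max(start_len, min(full_len, length))


def truncate_batch(batch, seq_len):
    """Cut every token sequence in the batch to at most seq_len tokens."""
    return [tokens[:seq_len] for tokens in batch]


if __name__ == "__main__":
    for step in (0, 50, 100):
        print(step, warmup_seq_len(step, 100, 64, 1024))
```

Early steps see short sequences, so each batch is cheaper to process. The schedule then reaches the full length once warmup ends.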

Pre-training a BERT with Curriculum Learning by Increasing Block-Size of Input Text

A new CL method is proposed which gradually increases the block-size of input text for training the self-attention mechanism of BERT and its variants using the maximum available batch-size and outperforms the baseline in terms of convergence speed and final performance on downstream tasks.


This work proposes ZONE, a novel computational framework that operationalizes ZPD through the language of Bayesian probability theory. It reveals that tasks should be selected by difficulty and learning progression, and it constrains the teacher to pick tasks within the student’s ZPD.

Curriculum learning for language modeling

The effect of curriculum learning on language model pretraining is explored using various linguistically motivated curricula, and transfer performance on the GLUE Benchmark is evaluated.

CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS

Curriculum Learning On Sharing Extent is proposed, which can obtain a better ranking quality across different computational budget constraints than other one-shot supernets, and is able to discover superior architectures when combined with various search strategies.



A Comprehensive Survey on Curriculum Learning

This article summarizes existing CL designs based on the general framework of Difficulty Measurer + Training Scheduler and further categorizes the methodologies for automatic CL into four groups, i.e., Self-paced Learning, Transfer Teacher, RL Teacher, and Other Automatic CL.
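
The Difficulty Measurer + Training Scheduler split can be illustrated with a small sketch. The length-based difficulty proxy, the linear pacing function, and all function names are assumptions chosen for illustration. They are not taken from the survey.

```python
import random


def difficulty_measurer(sample):
    # Hypothetical proxy: longer samples are treated as harder.
    return len(sample)


def training_scheduler(epoch, total_epochs, start_frac=0.2):
    """Linear pacing: fraction of the easiest data available at this epoch."""
    if total_epochs <= 1:
        return 1.0
    frac = start_frac + (1.0 - start_frac) * epoch / (total_epochs - 1)
    return min(1.0, frac)


def curriculum_batches(data, total_epochs, batch_size, seed=0):
    """Yield (epoch, batch) pairs drawn from a growing easy-to-hard subset."""
    rng = random.Random(seed)
    ranked = sorted(data, key=difficulty_measurer)  # easy first
    for epoch in range(total_epochs):
        n = max(1, int(round(training_scheduler(epoch, total_epochs) * len(ranked))))
        subset = ranked[:n]   # slicing copies, so ranked stays sorted
        rng.shuffle(subset)   # shuffle within the currently allowed subset
        for i in range(0, n, batch_size):
            yield epoch, subset[i:i + batch_size]


if __name__ == "__main__":
    data = ["a", "bb", "ccc", "dddd", "eeeee"]
    for epoch, batch in curriculum_batches(data, total_epochs=3, batch_size=2):
        print(epoch, batch)
```

In this sketch the measurer ranks the data once up front, and the scheduler alone decides how much of the ranking is exposed at each epoch. Either component can be swapped out without changing the other.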

Learning Curriculum Policies for Reinforcement Learning

The method is extended to handle multiple transfer learning algorithms, and it is shown for the first time that a curriculum policy over this MDP can be learned from experience.

Teacher–Student Curriculum Learning

We propose Teacher–Student Curriculum Learning (TSCL), a framework for automatic curriculum learning, where the Student tries to learn a complex task, and the Teacher automatically chooses subtasks

Curriculum Learning with Diversity for Supervised Computer Vision Tasks

A novel curriculum sampling strategy is proposed that takes into consideration the diversity of the training data together with the difficulty of the inputs, leading to faster convergence and more accurate results when other curriculum-based strategies fail.

Meta Automatic Curriculum Learning

This work presents AGAIN, a first instantiation of Meta-ACL, and showcases its benefits for curriculum generation over classical ACL in multiple simulated environments including procedurally generated parkour environments with learners of varying morphologies.

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

This article presents a framework for curriculum learning (CL) in reinforcement learning, and uses it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals.

Curriculum learning

It is hypothesized that curriculum learning has both an effect on the speed of convergence of the training process to a minimum and on the quality of the local minima obtained: curriculum learning can be seen as a particular form of continuation method (a general strategy for global optimization of non-convex functions).

Curriculum Learning by Dynamic Instance Hardness

This paper introduces dynamic instance hardness (DIH), the exponential moving average of a sample’s instantaneous hardness over the training history that indicates that a model retains knowledge about a sample over time, and implies a flat loss landscape for that sample.
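
As a rough sketch, an exponential moving average of per-sample hardness can be tracked as shown here. The decay value, the use of a loss-like number as the instantaneous hardness, and the class and method names are assumptions for illustration, not the paper's implementation.

```python
class DynamicInstanceHardness:
    """Tracks an exponential moving average of hardness for each sample."""

    def __init__(self, num_samples, decay=0.9):
        self.decay = decay
        self.dih = [0.0] * num_samples
        self.seen = [False] * num_samples

    def update(self, index, hardness):
        # 'hardness' is an instantaneous measure, e.g. the current loss
        # on this sample (an assumed choice).
        if not self.seen[index]:
            # First observation initializes the average.
            self.dih[index] = hardness
            self.seen[index] = True
        else:
            self.dih[index] = (
                self.decay * self.dih[index] + (1.0 - self.decay) * hardness
            )

    def hardest(self, k):
        """Return indices of the k samples with the highest tracked hardness."""
        order = sorted(range(len(self.dih)), key=lambda i: self.dih[i], reverse=True)
        return order[:k]


if __name__ == "__main__":
    tracker = DynamicInstanceHardness(num_samples=3, decay=0.5)
    tracker.update(0, 2.0)
    tracker.update(0, 0.0)
    tracker.update(1, 3.0)
    print(tracker.dih, tracker.hardest(2))
```

A sampler could then favor the indices returned by `hardest`, so that training revisits samples whose hardness has stayed high over the training history.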

Source Task Creation for Curriculum Learning

This paper presents the more ambitious problem of curriculum learning in reinforcement learning, in which the goal is to design a sequence of source tasks for an agent to train on, such that final performance or learning speed is improved.

On The Power of Curriculum Learning in Training Deep Networks

This work analyzes the effect of curriculum learning, which involves the non-uniform sampling of mini-batches, on the training of deep networks, and specifically CNNs trained for image recognition, and defines the concept of an ideal curriculum.