Learning to Continually Learn

@inproceedings{beaulieu2020learning,
  title={Learning to Continually Learn},
  author={Shawn L. E. Beaulieu and Lapo Frati and Thomas Miconi and Joel Lehman and Kenneth O. Stanley and Jeff Clune and Nick Cheney},
  booktitle={European Conference on Artificial Intelligence},
  year={2020}
}
Continual lifelong learning requires an agent or model to learn many sequentially ordered tasks, building on previous knowledge without catastrophically forgetting it. Much work has gone towards preventing the default tendency of machine learning models to catastrophically forget, yet virtually all such work involves manually-designed solutions to the problem. We instead advocate meta-learning a solution to catastrophic forgetting, allowing AI to learn to continually learn. Inspired by… 
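The abstract's core idea, a meta-learned neuromodulatory network that gates where learning happens in a prediction network, can be sketched minimally as follows. This is an illustrative toy, not the authors' implementation; all variable names (`W_pred`, `W_nm`, `gated_forward`) are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Two-network sketch: a neuromodulatory (NM) network produces a gate that
# multiplies the prediction network's activations. Because gradients flow
# through the gate, learning is localized to the units the NM network
# leaves open for a given input.
W_pred = rng.normal(size=(4, 8))   # prediction-network layer
W_nm = rng.normal(size=(4, 8))     # neuromodulatory-network layer

def gated_forward(x):
    h = relu(x @ W_pred)                   # ordinary activations
    gate = 1 / (1 + np.exp(-(x @ W_nm)))   # sigmoid gate in (0, 1)
    return gate * h                        # element-wise modulation

x = rng.normal(size=(1, 4))
out = gated_forward(x)
print(out.shape)  # (1, 8)
```

In the paper this gating is itself meta-learned in an outer loop so that inner-loop updates interfere as little as possible across tasks.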


Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

A new kind of connectionist architecture, the Sequential Neural Coding Network, is proposed that is robust to forgetting when learning from streams of data points and, unlike today's networks, does not learn via the immensely popular back-propagation of errors.

A Meta-Learned Neuron model for Continual Learning

This work replaces the standard neuron with a meta-learned neuron model whose inference and update rules are optimized to minimize catastrophic interference; the model can memorize dataset-length sequences of training samples, and its learning capabilities generalize to any domain.

Learning to modulate random weights can induce task-specific contexts for economical meta and continual learning

A single-hidden-layer network is introduced that learns only a relatively small context vector per task, which neuromodulates fixed, randomized weights that transform the input; the approach can be generalized into a framework for continual learning without knowledge of task boundaries.

Meta-Consolidation for Continual Learning

The authors' experiments with continual learning benchmarks on the MNIST, CIFAR-10, CIFAR-100 and Mini-ImageNet datasets show consistent improvement over five baselines, including a recent state-of-the-art method, corroborating the promise of MERLIN.

Few-shot Continual Learning: a Brain-inspired Approach

This paper provides a first systematic study of FSCL and presents an effective solution with deep neural networks, based on the observation that continual learning of a task sequence inevitably interferes with few-shot generalization.

Continual learning under domain transfer with sparse synaptic bursting

A system is introduced that can learn sequentially over previously unseen datasets (ImageNet, CIFAR-100) with little forgetting over time, by controlling the activity of weights in a convolutional neural network on the basis of inputs, using top-down regulation generated by a second feed-forward neural network.

Prototypes-Guided Memory Replay for Continual Learning

A dynamic prototypes-guided memory replay module is devised and incorporated into an online meta-learning model; the experimental results demonstrate the method's superiority in alleviating catastrophic forgetting and enabling efficient knowledge transfer.

Wide Neural Networks Forget Less Catastrophically

This work focuses on the model itself and studies the impact of the "width" of the neural network architecture on catastrophic forgetting, showing that width has a surprisingly significant effect on forgetting.

Condensed Composite Memory Continual Learning

  • Felix Wiewel, Binh Yang
  • Computer Science
    2021 International Joint Conference on Neural Networks (IJCNN)
  • 2021
This work proposes a novel way of learning a small set of synthetic examples which capture the essence of a complete dataset, and learns a weighted combination of shared components for each example, enabling a significant increase in memory efficiency.

Do What Nature Did To Us: Evolving Plastic Recurrent Neural Networks For Task Generalization

The experimental results demonstrate the unique advantage of EPRNN over the state of the art based on plasticity and recursion, while yielding comparably good performance against deep learning based approaches on the tasks.



Overcoming catastrophic forgetting in neural networks

It is shown that it is possible to overcome this limitation of connectionist models and train networks that can maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
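The "selective slowing" this paper describes (elastic weight consolidation) takes the form of a quadratic penalty anchoring important weights near their old-task values. A minimal sketch with placeholder names and toy numbers:

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """EWC-style penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    fisher estimates each parameter's importance to previous tasks;
    a large F_i makes moving that weight costly, slowing its learning.
    """
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)

theta = np.array([1.0, 2.0, 3.0])        # current parameters
theta_star = np.array([1.0, 0.0, 3.0])   # parameters after the old task
fisher = np.array([10.0, 0.1, 10.0])     # per-parameter importance estimates

# Only the unimportant middle weight has drifted, so the penalty stays small:
print(ewc_penalty(theta, theta_star, fisher))  # 0.2
```

This penalty is simply added to the loss of the new task, so gradient descent trades off new-task performance against disturbing weights the old tasks depend on.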

Meta-Learning Representations for Continual Learning

It is shown that it is possible to learn naturally sparse representations that are more effective for online updating, and it is demonstrated that a basic online updating strategy on representations learned by OML is competitive with rehearsal-based methods for continual learning.

Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks

The results suggest that diffusion-based neuromodulation promotes task-specific localized learning and functional modularity, which can help solve the challenging, but important problem of catastrophic forgetting.

Attention-Based Selective Plasticity

This work defines an attention-based selective plasticity of synapses, modeled on the cholinergic neuromodulatory system in the brain, and uses Hebbian learning in parallel with the backpropagation algorithm to learn synaptic importances in an online and seamless manner.

Neural Modularity Helps Organisms Evolve to Learn New Skills without Forgetting Old Skills

It is suggested that encouraging modularity in neural networks may help to overcome the long-standing barrier of networks that cannot learn new skills without forgetting old ones, and that one benefit of the modularity ubiquitous in the brains of natural animals might be to alleviate the problem of catastrophic forgetting.

Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization

This study proposes a neuroscience-inspired scheme, called “context-dependent gating,” in which mostly nonoverlapping sets of units are active for any one task, which allows ANNs to maintain high performance across large numbers of sequentially presented tasks, particularly when combined with weight stabilization.
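The gating scheme described above can be illustrated with a few lines of code: each task gets an independent random binary mask over hidden units, so the active sets for different tasks are mostly non-overlapping. A toy sketch (placeholder names, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def task_gate(n_units, keep_frac, rng):
    """Random binary gate keeping roughly keep_frac of hidden units active.

    Drawing an independent gate per task makes the active sets for two
    tasks overlap only in expectation keep_frac**2 of the units, which
    limits interference between tasks.
    """
    return (rng.random(n_units) < keep_frac).astype(float)

n_units = 1000
gates = {t: task_gate(n_units, 0.2, rng) for t in range(3)}

h = rng.normal(size=n_units)   # hidden activations
h_task0 = gates[0] * h         # only task 0's subset passes through

overlap = np.mean(gates[0] * gates[1])  # expected ~0.04 (= 0.2 * 0.2)
print(overlap)
```

In the paper this gating is combined with weight stabilization (e.g. an EWC-style penalty), since the small overlapping subset still needs protection.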

Learning to reinforcement learn

This work introduces a novel approach to deep meta-reinforcement learning, in which a system is trained using one RL algorithm but its recurrent dynamics implement a second, quite separate RL procedure.

Catastrophic forgetting in connectionist networks

  • R. French
  • Computer Science
    Trends in Cognitive Sciences
  • 1999

RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning

This paper proposes to represent a "fast" reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data, encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm.

Differentiable plasticity: training plastic neural networks with backpropagation

It is shown that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections, and it is concluded that differentiable plasticity may provide a powerful novel approach to the learning-to-learn problem.
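The plastic connections described above combine a fixed weight with a fast Hebbian trace, scaled by a learned per-connection plasticity coefficient. A minimal numpy sketch of one such layer (illustrative only; `w`, `alpha`, and `eta` would be trained by gradient descent in the actual method):

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 3, 2
w = rng.normal(size=(n_in, n_out))      # fixed (slow) weights, trained by SGD
alpha = rng.normal(size=(n_in, n_out))  # per-connection plasticity, also trained
hebb = np.zeros((n_in, n_out))          # fast Hebbian trace, reset each episode
eta = 0.1                               # trace update rate

def step(x, hebb):
    # Effective weight mixes the fixed component with the plastic trace.
    y = np.tanh(x @ (w + alpha * hebb))
    # Hebbian update: decaying average of pre/post activity outer products.
    hebb = (1 - eta) * hebb + eta * np.outer(x, y)
    return y, hebb

x = rng.normal(size=n_in)
y1, hebb = step(x, hebb)   # first presentation: trace is still zero
y2, hebb = step(x, hebb)   # second presentation: the trace now shapes the output
```

Because `step` is differentiable in `w` and `alpha`, the whole within-episode learning rule can be optimized end to end by backpropagation, which is the paper's central point.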