Memory Aware Synapses: Learning what (not) to forget

  • R. Aljundi, F. Babiloni, M. Elhoseiny, M. Rohrbach, T. Tuytelaars
Humans can learn in a continuous manner. Old, rarely utilized knowledge can be overwritten by new incoming information, while important, frequently used knowledge is prevented from being erased. In artificial learning systems, lifelong learning so far has focused mainly on accumulating knowledge over tasks and overcoming catastrophic forgetting. In this paper, we argue that, given the limited model capacity and the unlimited new information to be learned, knowledge has to be preserved or erased…
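The core quantity in Memory Aware Synapses is a per-weight importance estimate: the average sensitivity of the (squared) output norm to each parameter. Below is a minimal NumPy sketch for a toy linear model; the function name and setup are illustrative, not taken from the paper's code.

```python
import numpy as np

def mas_importance(W, X):
    """MAS-style importance for a linear model y = W @ x: the average absolute
    gradient of the squared output norm ||W x||^2 with respect to each weight.
    Analytically, d ||W x||^2 / dW = 2 (W x) x^T."""
    omega = np.zeros_like(W)
    for x in X:
        y = W @ x
        omega += np.abs(2.0 * np.outer(y, x))
    return omega / len(X)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
X = rng.normal(size=(200, 4))
X[:, 0] *= 10.0  # the first input dimension dominates the output norm
omega = mas_importance(W, X)
```

Weights that read the dominant input dimension receive much higher importance, so a subsequent regularizer can penalize changing them while leaving the rest free to learn new tasks.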

Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

A new kind of connectionist architecture is proposed, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the immensely popular back-propagation of errors.

Overcome Anterograde Forgetting with Cycled Memory Networks

Experimental results demonstrate that the CMN can effectively address the anterograde forgetting on several task-related, task-conflict, class-incremental and cross-border tasks.

Attention-Based Selective Plasticity

This work defines an attention-based selective plasticity of synapses, based on the cholinergic neuromodulatory system in the brain, and uses Hebbian learning in parallel with the backpropagation algorithm to learn synaptic importances in an online and seamless manner.

Homeostasis-Inspired Continual Learning: Learning to Control Structural Regularization

This work reveals that a careful selection of IoR during continual training can remarkably improve task accuracy, and provides experimental results on various types of continual learning tasks showing that the proposed method notably outperforms conventional methods in terms of learning accuracy and knowledge forgetting.

Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization

This study proposes a neuroscience-inspired scheme, called “context-dependent gating,” in which mostly nonoverlapping sets of units are active for any one task, which allows ANNs to maintain high performance across large numbers of sequentially presented tasks, particularly when combined with weight stabilization.
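The gating idea can be sketched in a few lines: assign each task a fixed, sparse binary mask over hidden units, so different tasks use mostly disjoint sub-networks. This is an illustrative NumPy sketch, not the paper's implementation.

```python
import numpy as np

def make_task_gates(n_units, n_tasks, keep_prob=0.2, seed=0):
    """One fixed binary mask per task; with a small keep_prob the active
    unit sets of different tasks are mostly non-overlapping."""
    rng = np.random.default_rng(seed)
    return (rng.random((n_tasks, n_units)) < keep_prob).astype(float)

def gated_forward(h, gates, task_id):
    """Silence the hidden units not assigned to the current task."""
    return h * gates[task_id]

gates = make_task_gates(n_units=100, n_tasks=5)
h = np.ones(100)             # stand-in for a hidden activation vector
h_task0 = gated_forward(h, gates, 0)
```

With keep_prob = 0.2, any two tasks share only about keep_prob² ≈ 4% of units in expectation, so gradient updates for one task barely touch units serving another; the paper combines this with weight stabilization for the shared units.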


A novel and effective lifelong learning algorithm, called MixEd stochastic GrAdient (MEGA), which allows deep neural networks to acquire the ability of retaining performance on old tasks while learning new tasks.

Local learning rules to attenuate forgetting in neural networks

This work shows that a purely local weight consolidation mechanism, based on estimating energy landscape curvatures from locally available statistics, prevents pattern interference and catastrophic forgetting in the Hopfield network, and clarifies how global information-geometric structure in a learning problem can be exposed in local model statistics.

Enabling Continual Learning with Differentiable Hebbian Plasticity

A Differentiable Hebbian Consolidation model is proposed, composed of a DHP Softmax layer that adds a rapid-learning plastic component to the fixed parameters of the softmax output layer, enabling learned representations to be retained over a longer timescale.

Learning by Active Forgetting for Neural Networks

A learning model with an active forgetting mechanism in artificial neural networks sheds light on the importance of forgetting in the learning process and offers new perspectives for understanding the underlying mechanisms of neural networks.



Improved multitask learning through synaptic intelligence

This study introduces a model of intelligent synapses that accumulate task relevant information over time, and exploits this information to efficiently consolidate memories of old tasks to protect them from being overwritten as new tasks are learned.
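The accumulation described here is a path integral: each parameter's running contribution to the loss decrease is summed during training, then normalized by how far the parameter travelled. A simplified NumPy sketch, with illustrative names and a toy quadratic task (not the paper's code):

```python
import numpy as np

class SynapticIntelligence:
    """Simplified path-integral importance: during training accumulate
    w += -grad * step per parameter; at the end of a task the importance
    is omega = w / (total_change^2 + xi), with xi a damping constant."""
    def __init__(self, theta, xi=0.1):
        self.theta_start = theta.copy()
        self.w = np.zeros_like(theta)
        self.xi = xi

    def accumulate(self, grad, step):
        self.w += -grad * step

    def task_importance(self, theta_end):
        change = theta_end - self.theta_start
        return np.maximum(self.w, 0.0) / (change**2 + self.xi)

# Toy task: the loss depends strongly on theta[0] and barely on theta[1].
theta = np.zeros(2)
si = SynapticIntelligence(theta)
lr = 0.05
for _ in range(100):
    grad = np.array([10.0 * (theta[0] - 1.0), 0.02 * (theta[1] - 1.0)])
    step = -lr * grad
    si.accumulate(grad, step)
    theta = theta + step
omega = si.task_importance(theta)
```

The parameter that actually drove the loss down accumulates large importance, while the near-irrelevant one does not; a quadratic penalty weighted by omega then protects the first when the next task is trained.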

Catastrophic forgetting in connectionist networks

  • R. French
  • Computer Science
    Trends in Cognitive Sciences
  • 1999

Overcoming catastrophic forgetting in neural networks

It is shown that it is possible to overcome this limitation of connectionist models and train networks that can maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
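The "selective slowing" is implemented as a quadratic penalty that anchors each weight to its post-old-task value in proportion to its estimated (diagonal Fisher) importance. A minimal NumPy sketch of that penalty, with illustrative values:

```python
import numpy as np

def ewc_penalty(theta, theta_old, fisher, lam=1.0):
    """Quadratic penalty anchoring each weight to its old value in
    proportion to its diagonal Fisher importance:
        (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2"""
    return 0.5 * lam * np.sum(fisher * (theta - theta_old) ** 2)

theta_old = np.array([1.0, 1.0])
fisher = np.array([100.0, 0.01])  # the first weight mattered for the old task

# Moving the important weight by 1.0 is heavily penalized ...
costly = ewc_penalty(np.array([2.0, 1.0]), theta_old, fisher)
# ... while moving the unimportant weight by 1.0 is nearly free.
cheap = ewc_penalty(np.array([1.0, 2.0]), theta_old, fisher)
```

Adding this term to the new task's loss makes gradient descent effectively slow on high-Fisher weights (they are pulled back toward theta_old) while leaving low-Fisher weights free to adapt.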

An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks

It is found that it is always best to train using the dropout algorithm, which is consistently best at adapting to the new task and remembering the old task, and has the best tradeoff curve between these two extremes.

Connectionist models of recognition memory: constraints imposed by learning and forgetting functions.

  • R. Ratcliff
  • Psychology, Computer Science
    Psychological review
  • 1990
The problems discussed impose limitations on connectionist models applied to human memory and to tasks where the information to be learned is not all available during learning.

Gradient Episodic Memory for Continual Learning

A model for continual learning, called Gradient Episodic Memory (GEM) is proposed that alleviates forgetting, while allowing beneficial transfer of knowledge to previous tasks.
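GEM's key step is a constrained update: if the new-task gradient conflicts with the gradient on the episodic memory of an old task, it is projected so the update no longer increases the old task's loss. A single-constraint NumPy sketch (the full method solves a quadratic program over all stored tasks):

```python
import numpy as np

def gem_project(g, g_ref):
    """Single-constraint sketch of GEM's projection: if the new-task
    gradient g conflicts with the episodic-memory gradient g_ref
    (negative inner product), remove the conflicting component so the
    update does not increase the memory loss."""
    dot = g @ g_ref
    if dot >= 0:
        return g  # no interference with the stored task
    return g - (dot / (g_ref @ g_ref)) * g_ref

g_conflict = gem_project(np.array([-1.0, 1.0]), np.array([1.0, 0.0]))
g_aligned = gem_project(np.array([1.0, 2.0]), np.array([1.0, 0.0]))
```

After projection the conflicting gradient is orthogonal to the memory gradient, so a small step along it leaves the old task's loss unchanged to first order, while aligned gradients pass through untouched (which is what permits beneficial backward transfer).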

Expert Gate: Lifelong Learning with a Network of Experts

A model of lifelong learning, based on a Network of Experts, with a set of gating autoencoders that learn a representation for the task at hand, and, at test time, automatically forward the test sample to the relevant expert.

Learning without Forgetting

This work proposes the Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities, and performs favorably compared to commonly used feature extraction and fine-tuning adaption techniques.

Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory.

The account presented here suggests that memories are first stored via synaptic changes in the hippocampal system, that these changes support reinstatement of recent memories in the neocortex, that neocortical synapses change a little on each reinstatement, and that remote memory is based on accumulated neocortical changes.