Continuous Learning of Context-dependent Processing in Neural Networks

@article{Zeng2019ContinuousLO,
  title={Continuous Learning of Context-dependent Processing in Neural Networks},
  author={Guanxiong Zeng and Yang Chen and Bo Cui and Shan Yu},
  journal={Nat. Mach. Intell.},
  year={2019},
  volume={1},
  pages={364-372}
}
Deep artificial neural networks (DNNs) are powerful tools for recognition and classification as they learn sophisticated mapping rules between the inputs and the outputs. […] This would enable highly compact systems to gradually learn myriads of regularities of the real world and eventually behave appropriately within it.
Efficient and robust multi-task learning in the brain with modular latent primitives
TLDR
A system that learns new tasks by building on top of pre-trained latent dynamics organised into separate recurrent modules, which allows multiple tasks to be learned while keeping parameter counts and updates low, pointing towards cheaper multi-task solutions.
A Unified Framework for Lifelong Learning in Deep Neural Networks
TLDR
This paper proposes a simple yet powerful unified framework that demonstrates the desirable properties of lifelong learning, including non-forgetting, concept rehearsal, and forward and backward transfer of knowledge, among others.
Gradient Projection Memory for Continual Learning
TLDR
This work proposes a novel approach in which a neural network learns new tasks by taking gradient steps in directions orthogonal to the gradient subspaces deemed important for past tasks, and shows that this induces minimal to no interference with the past tasks, thereby mitigating forgetting.
Orthogonal Gradient Descent for Continual Learning
TLDR
The Orthogonal Gradient Descent (OGD) method is presented, which accomplishes this goal by projecting the gradients from new tasks onto a subspace in which the neural network output on previous tasks does not change, while the projected gradient remains a useful direction for learning the new task.
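To make the projection idea shared by GPM and OGD concrete, here is a minimal numerical sketch: gradients for a new task are projected onto the orthogonal complement of a stored subspace that matters for earlier tasks. The helper names (orthonormal_basis, project_out) and the toy dimensions are illustrative assumptions, not the papers' reference implementations.

```python
# Minimal sketch of orthogonal gradient projection (GPM/OGD-style idea):
# new-task gradients are projected onto the orthogonal complement of a
# stored subspace associated with earlier tasks.
import numpy as np

def orthonormal_basis(vectors, eps=1e-10):
    """Orthonormalize stored reference vectors (e.g. old-task gradients)."""
    q, r = np.linalg.qr(np.stack(vectors, axis=1))
    keep = np.abs(np.diag(r)) > eps
    return q[:, keep]                      # columns span the protected subspace

def project_out(grad, basis):
    """Remove the component of `grad` that lies in the protected subspace."""
    return grad - basis @ (basis.T @ grad)

# toy usage: protect two reference directions, then update with a new gradient
dim = 8
memory = [np.random.randn(dim) for _ in range(2)]   # e.g. gradients from old tasks
basis = orthonormal_basis(memory)

theta = np.random.randn(dim)                        # flattened parameters
new_grad = np.random.randn(dim)                     # gradient on the new task
theta -= 0.1 * project_out(new_grad, basis)         # step leaves protected directions untouched

assert np.allclose(basis.T @ project_out(new_grad, basis), 0.0, atol=1e-8)
```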
Efficient and robust multi-task learning in the brain with modular task primitives
TLDR
This work shows that a modular network endowed with task primitives allows multiple tasks to be learned well while keeping parameter counts and updates low, and makes predictions for novel neuroscience experiments in which targeted perturbations are employed to explore solution spaces.
Continual Learning Using Multi-view Task Conditional Neural Networks
TLDR
The proposed Multi-view Task Conditional Neural Networks (Mv-TCNN) do not require recurring tasks to be known in advance and outperform state-of-the-art solutions in continual learning and in adapting to new tasks that are not defined in advance.
Continual Learning Using Task Conditional Neural Networks
TLDR
This work proposes Task Conditional Neural Networks (TCNN), which do not require recurring tasks to be known in advance and outperform state-of-the-art solutions in continual learning and in adapting to new tasks that are not defined in advance.
Learning with Long-term Remembering: Following the Lead of Mixed Stochastic Gradient
TLDR
A novel and effective lifelong learning algorithm, called MixEd stochastic GrAdient (MEGA), which allows deep neural networks to acquire the ability to retain performance on old tasks while learning new tasks.
A biologically inspired architecture with switching units can learn to generalize across backgrounds
TLDR
This study shows how a bio-inspired architectural motif can contribute to task generalization across contexts, using a novel dataset for digit classification, and proposes an architecture with additional “switching” units that are activated in the presence of a new background.
...

References

Showing 1-10 of 77 references
Overcoming catastrophic forgetting in neural networks
TLDR
It is shown that it is possible to overcome the limitation of connectionist models and train networks that can maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
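A minimal sketch of the kind of importance-weighted quadratic penalty this summary describes (slowing learning on weights important for previous tasks) is shown below; the placeholder importance values, the function name ewc_penalty, and the regularisation strength are illustrative assumptions rather than the paper's exact recipe.

```python
# Sketch of an EWC-style penalty: a quadratic cost anchored at the parameters
# recorded after the previous task, weighted by a diagonal importance estimate.
import torch
import torch.nn as nn

def ewc_penalty(model, theta_old, fisher, lambda_reg=100.0):
    """Quadratic cost that slows learning on weights deemed important."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - theta_old[name]) ** 2).sum()
    return 0.5 * lambda_reg * loss

# toy setup: after finishing an old task, snapshot parameters and importances
model = nn.Linear(4, 2)
theta_old = {n: p.detach().clone() for n, p in model.named_parameters()}
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}  # placeholder importance

# training step on a new task
x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))
task_loss = nn.functional.cross_entropy(model(x), y)
total_loss = task_loss + ewc_penalty(model, theta_old, fisher)
total_loss.backward()   # gradients now include the "slow down" term
```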
Context-dependent computation by recurrent dynamics in prefrontal cortex
TLDR
This work studies prefrontal cortex activity in macaque monkeys trained to flexibly select and integrate noisy sensory inputs towards a choice, and finds that the observed complexity and functional roles of single neurons are readily understood in the framework of a dynamical process unfolding at the level of the population.
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
TLDR
It is found that it is always best to train using the dropout algorithm: dropout is consistently best at adapting to the new task and remembering the old task, and it has the best tradeoff curve between these two extremes.
Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization
TLDR
This study proposes a neuroscience-inspired scheme, called “context-dependent gating,” in which mostly nonoverlapping sets of units are active for any one task, which allows ANNs to maintain high performance across large numbers of sequentially presented tasks, particularly when combined with weight stabilization.
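The gating scheme summarized here can be illustrated with a small sketch in which each task is assigned a fixed random mask so that mostly non-overlapping subsets of hidden units are active; the 20% keep-rate and the helper names are illustrative choices, not the paper's configuration.

```python
# Sketch of context-dependent gating: a fixed random mask per task silences
# most hidden units, so different tasks use mostly non-overlapping subsets.
import numpy as np

def make_task_masks(n_tasks, n_hidden, keep_frac=0.2, seed=0):
    rng = np.random.default_rng(seed)
    return [(rng.random(n_hidden) < keep_frac).astype(float) for _ in range(n_tasks)]

def gated_hidden(x, W, b, mask):
    h = np.tanh(x @ W + b)
    return h * mask               # only this task's subset of units stays active

n_hidden = 100
masks = make_task_masks(n_tasks=5, n_hidden=n_hidden)
W, b = np.random.randn(10, n_hidden) * 0.1, np.zeros(n_hidden)
x = np.random.randn(3, 10)
h_task2 = gated_hidden(x, W, b, masks[2])   # forward pass under task 2's gate
```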
Controlling Recurrent Neural Networks by Conceptors
TLDR
A mechanism of neurodynamical organization, called conceptors, is proposed, which unites nonlinear dynamics with basic principles of conceptual abstraction and logic, and helps explain how conceptual-level information processing emerges naturally and robustly in neural systems.
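To give a concrete feel for conceptors, the sketch below computes a conceptor matrix from collected reservoir states using the standard definition from the conceptor literature, C = R (R + a^-2 I)^-1, where R is the state correlation matrix and a the aperture; the toy driven reservoir, its size, and the aperture value are illustrative assumptions, not the paper's setup.

```python
# Sketch: compute a conceptor matrix from reservoir states collected while
# a toy random RNN is driven by a sine pattern.
import numpy as np

def conceptor(states, aperture=10.0):
    """states: (n_steps, n_units) collected while the RNN runs one pattern."""
    n_units = states.shape[1]
    R = states.T @ states / states.shape[0]          # state correlation matrix
    return R @ np.linalg.inv(R + aperture ** (-2) * np.eye(n_units))

rng = np.random.default_rng(0)
W = rng.standard_normal((50, 50)) * 0.1              # toy recurrent weights
Win = rng.standard_normal((50, 1))                   # toy input weights
x, states = np.zeros(50), []
for t in range(200):
    x = np.tanh(W @ x + Win[:, 0] * np.sin(0.3 * t))
    states.append(x.copy())
C = conceptor(np.array(states))
# applying C to a state (C @ x) softly projects it onto the pattern's subspace
```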
How transferable are features in deep neural networks?
TLDR
This paper quantifies the generality versus specificity of neurons in each layer of a deep convolutional neural network and reports a few surprising results, including that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
Connectionist models of recognition memory: constraints imposed by learning and forgetting functions.
R. Ratcliff, Psychological Review, 1990
TLDR
The problems discussed impose constraints on connectionist models applied to human memory and to tasks where the information to be learned is not all available during learning.
Understanding the difficulty of training deep feedforward neural networks
TLDR
The objective here is to understand why standard gradient descent from random initialization does so poorly with deep neural networks, to better understand recent relative successes, and to help design better algorithms in the future.
Continual Learning Through Synaptic Intelligence
TLDR
This study introduces intelligent synapses that bring some of the biological complexity of real synapses into artificial neural networks, and shows that this dramatically reduces forgetting while maintaining computational efficiency.
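The online importance estimate behind such "intelligent synapses" can be sketched as accumulating, per parameter, the product of the gradient and the parameter change along the training trajectory; the toy loop and variable names below are illustrative, not the paper's reference code.

```python
# Sketch of a Synaptic Intelligence-style online importance estimate:
# per-parameter path-integral contributions accumulated during training.
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
omega = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

for _ in range(100):                       # toy training loop on one task
    x, y = torch.randn(16, 4), torch.randint(0, 2, (16,))
    prev = {n: p.detach().clone() for n, p in model.named_parameters()}
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    for n, p in model.named_parameters():
        # path-integral contribution: -grad * (parameter change on this step)
        omega[n] += -p.grad.detach() * (p.detach() - prev[n])

# `omega` now scores how much each weight contributed to lowering the loss;
# it can feed a quadratic penalty (as in the EWC sketch above) on later tasks.
```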
...