Reduction of catastrophic forgetting with transfer learning and ternary output codes

@article{Gutstein2015ReductionOC,
  title={Reduction of catastrophic forgetting with transfer learning and ternary output codes},
  author={Steven Gutstein and Ethan Stump},
  journal={2015 International Joint Conference on Neural Networks (IJCNN)},
  year={2015},
  pages={1-8}
}
Historically, neural nets have learned new things at the cost of forgetting what they already know. […] Key Method: Our approach is unique in that it both uses transfer learning to mitigate catastrophic forgetting and focuses upon the output nodes of a neural network. This results in a technique that makes it easier, rather than harder, to learn new tasks while retaining existing knowledge, and that is architecture independent and trivial to implement on any existing net. Additionally, we examine the use of ternary…
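
The abstract only sketches the key method, so the following is a minimal, hypothetical illustration of how ternary output codes with a "don't care" value can confine new learning to the output nodes of an already trained net. The codewords, the masking rule, the linear output layer, and all sizes below are assumptions made for illustration, not the paper's exact procedure.

# Sketch only: each class gets a ternary codeword over shared output nodes,
# with 0 marking "don't care" bits that are excluded from the loss, so
# training a new class leaves the bits that encode old classes alone.
import numpy as np

rng = np.random.default_rng(0)

n_outputs = 8                                   # output nodes shared by all classes
codebook = {                                    # hypothetical ternary codewords
    "old_class_A": np.array([+1, -1, +1, -1,  0,  0,  0,  0]),
    "old_class_B": np.array([-1, +1, -1, +1,  0,  0,  0,  0]),
    "new_class_C": np.array([ 0,  0,  0,  0, +1, -1, +1, -1]),
    "new_class_D": np.array([ 0,  0,  0,  0, -1, +1, -1, +1]),
}

def output_layer_gradient(features, outputs, target_code):
    # Gradient of a masked squared error w.r.t. a linear output layer
    # (outputs = W @ features); don't-care bits contribute zero error.
    mask = (target_code != 0).astype(float)
    delta = (outputs - target_code) * mask
    return np.outer(delta, features)

def classify(outputs):
    # Nearest codeword, measured only over each code's non-zero bits.
    def dist(code):
        m = code != 0
        return np.sum((outputs[m] - code[m]) ** 2)
    return min(codebook, key=lambda c: dist(codebook[c]))

# Toy usage: a frozen, pre-trained feature extractor feeds the single
# trainable layer; only the output weights are updated for the new class.
features = rng.normal(size=16)                  # stand-in for pre-trained features
W = rng.normal(scale=0.1, size=(n_outputs, 16))
for _ in range(200):
    W -= 0.05 * output_layer_gradient(features, W @ features, codebook["new_class_C"])
print(classify(W @ features))                   # -> "new_class_C"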

Citations

Overcoming catastrophic forgetting with hard attention to the task

TLDR
A task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning, and features the possibility to control both the stability and compactness of the learned knowledge, which makes it also attractive for online learning or network compression applications.

Understanding Forgetting in Artificial Neural Networks

TLDR
This thesis confirms the existence and severity of catastrophic forgetting in some contemporary machine learning systems by showing that it appears when a simple, modern ANN is trained incrementally, using a conventional algorithm, on a well-known multi-class classification setting (MNIST).

The effects of output codes on transfer learning in a deep convolutional neural net

TLDR
This paper compares how effective different non-semantic encodings are at causing a neural net to implicitly learn encodings for unobserved classes, and looks for evidence of a phenomenon akin to over-training.

Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks

TLDR
The results suggest that diffusion-based neuromodulation promotes task-specific localized learning and functional modularity, which can help solve the challenging, but important problem of catastrophic forgetting.

Overcoming Catastrophic Forgetting in Convolutional Neural Networks by Selective Network Augmentation

TLDR
A method to overcome catastrophic forgetting in convolutional neural networks, selective network augmentation (SeNA-CNN), that learns new tasks and preserves performance on old tasks without accessing the data used to train the original model.

DeepObliviate: A Powerful Charm for Erasing Data Residual Memory in Deep Neural Networks

TLDR
This paper proposes an approach, dubbed DEEPOBLIVIATE, to implement machine unlearning efficiently without modifying the normal training mode; it improves the original training process by storing intermediate models on the hard disk.

Catastrophic Interference in Disguised Face Recognition

TLDR
This work empirically evaluates several commonly used DCNN architectures on face recognition and distills some insights about the effect of sequential learning on distinct identities from different datasets, showing that the catastrophic forgetting phenomenon is present even in feature embeddings fine-tuned on different tasks from the original domain.

Plasticity and Firing Rate Dynamics in Leaky Integrate-and-Fire Models of Cortical Circuits

TLDR
This work aims to shed some light on firing rate dynamics as well as on how plasticity may play a role in developing cortical circuits, and investigates learning in artificial neural networks.

An object recognition system based on convolutional neural networks and angular resolutions

TLDR
This work proposes a novel algorithm, making use of angular resolutions and convolutional neural networks for 3D object recognition, and it collects image shapes or contours from real objects by placing them on a rotating display to record the appearances from multiple angular views.

Object-Based Augmentation for Building Semantic Segmentation: Ventura and Santa Rosa Case Study

TLDR
This study proposes a novel pipeline for georeferenced image augmentation that enables a significant increase in the number of training samples and leads to the meaningful improvement of U-Net model predictions from 0.78 to 0.83 F1-score.

References

SHOWING 1-10 OF 27 REFERENCES

An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks

TLDR
It is found that it is always best to train using the dropout algorithm: dropout is consistently best at adapting to the new task and at remembering the old task, and it has the best tradeoff curve between these two extremes.

Using pseudo-recurrent connectionist networks to solve the problem of sequential learning

TLDR
A method is described in which approximations of the previously learned data will be extracted from the network and mixed in with the new patterns to be learned, thereby alleviating sudden forgetting caused by new learning and allowing the network to forget gracefully.

Using Semi-Distributed Representations to Overcome Catastrophic Forgetting in Connectionist Networks

TLDR
A simple algorithm is presented that allows a standard feedforward backpropagation network to develop semi-distributed representations, thereby significantly reducing the problem of catastrophic forgetting.

Catastrophic Forgetting, Rehearsal and Pseudorehearsal

TLDR
A solution to the problem of catastrophic forgetting in neural networks is described, 'pseudorehearsal', a method which provides the advantages of rehearsal without actually requiring any access to the previously learned information (the original training population) itself.
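
The pseudorehearsal mechanism summarized above lends itself to a short sketch: random "pseudo-inputs" are labelled with the current network's own outputs, and these pseudo-items are mixed with the new patterns during further training, so the original training population is never needed. The tiny network, the uniform sampling of pseudo-inputs, and the output-layer-only update rule below are illustrative assumptions, not the paper's setup.

# Sketch of pseudorehearsal with a single-hidden-layer regression net.
import numpy as np

rng = np.random.default_rng(1)

def forward(params, x):
    W1, W2 = params
    return np.tanh(x @ W1) @ W2

def make_pseudo_items(params, n_items, n_inputs):
    # Sample random inputs and label them with the net's current outputs.
    x = rng.uniform(-1.0, 1.0, size=(n_items, n_inputs))
    return x, forward(params, x)

def train_step(params, x, y, lr=0.01):
    # One squared-error gradient step; only the output weights are
    # adjusted here, for brevity.
    W1, W2 = params
    h = np.tanh(x @ W1)
    err = h @ W2 - y
    return (W1, W2 - lr * h.T @ err / len(x))

# The "old" network stands in for whatever was learned previously.
params = (rng.normal(size=(4, 8)), rng.normal(size=(8, 2)))

# New-task data is interleaved with pseudo-items from the old network.
x_new = rng.uniform(-1.0, 1.0, size=(32, 4))
y_new = rng.normal(size=(32, 2))
x_pseudo, y_pseudo = make_pseudo_items(params, n_items=32, n_inputs=4)

x_mix = np.concatenate([x_new, x_pseudo])
y_mix = np.concatenate([y_new, y_pseudo])
for _ in range(100):
    params = train_step(params, x_mix, y_mix)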

Self-refreshing memory in artificial neural networks: learning temporal sequences without catastrophic forgetting

TLDR
A dual-network architecture is developed in which self-generated pseudopatterns reflect (non-temporally) all the sequences of temporally ordered items previously learned.

Latent learning - What your net also learned

TLDR
This paper uses a convolutional neural net to demonstrate not only a method of determining a net's latent responses, but also simple ways to optimize latent learning, and takes advantage of the fact that CNNs are deep nets to show how the latently learned accuracy of the CNN may be greatly improved by allowing only its output layer to train.

Mitigation of catastrophic interference in neural networks using a fixed expansion layer

  • R. Coop, I. Arel
  • Computer Science
    2012 IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS)
  • 2012
TLDR
The fixed expansion layer (FEL) feedforward neural network designed for balancing plasticity and stability in the presence of non-stationary inputs is presented and it is demonstrated that the FEL network is able to retain information for significantly longer periods of time with substantially lower computational requirements.
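
As a rough illustration of the fixed-expansion-layer idea summarized above, the sketch below projects inputs through a fixed, sparse, untrained expansion layer with a k-winners-take-all activation and trains only the weights that read out of that sparse code. The sizes, sparsity levels, and normalized delta-rule update are assumptions made for illustration, not details taken from the paper.

import numpy as np

rng = np.random.default_rng(3)

n_in, n_expand, n_out = 16, 256, 4
k_active = 16                                   # expansion units kept active per pattern

# Fixed (never trained) sparse random projection into the expansion layer.
W_fixed = rng.normal(size=(n_in, n_expand)) * (rng.random((n_in, n_expand)) < 0.1)
W_out = np.zeros((n_expand, n_out))             # the only trainable weights

def expand(x):
    # Sparse code: keep the k most active expansion units, zero the rest.
    a = x @ W_fixed
    threshold = np.sort(a)[-k_active]
    return np.where(a >= threshold, a, 0.0)

def train_step(x, target):
    # Normalized delta-rule step on the readout weights only.
    global W_out
    h = expand(x)
    err = h @ W_out - target
    W_out -= np.outer(h, err) / (h @ h)

# Toy usage: because each pattern activates only a few expansion units,
# sequentially learned patterns touch mostly disjoint readout weights.
x = rng.normal(size=n_in)
train_step(x, np.eye(n_out)[1])
print(np.argmax(expand(x) @ W_out))             # -> 1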

Using fast weights to deblur old memories

TLDR
All the original associations of a network can be "deblurred" by rehearsing on just a few of them: the fast weights take on values that temporarily cancel out the changes in the slow weights caused by the subsequent learning.
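
The summary above names a concrete mechanism (a slow and a fast weight on every connection, with the fast weights able to temporarily cancel recent changes to the slow weights), so here is a small sketch of that two-weight structure using a plain linear associator and a delta rule. The decay factor, learning rates, and the linear setting are illustrative assumptions, and this toy only shows the mechanism for the rehearsed pairs; it does not reproduce the paper's result that rehearsal also restores unrehearsed associations.

import numpy as np

rng = np.random.default_rng(2)

n = 20
slow = np.zeros((n, n))                         # slowly changing weights
fast = np.zeros((n, n))                         # rapidly decaying weights

def effective():
    # The network responds with the sum of slow and fast weights.
    return slow + fast

def learn(pairs, weights, lr, decay=1.0):
    # Delta-rule updates applied (in place) to the chosen weight matrix.
    for x, y in pairs:
        weights *= decay
        weights += lr * np.outer(y - effective() @ x, x)

# Store some original associations in the slow weights.
old_pairs = [(rng.normal(size=n), rng.normal(size=n)) for _ in range(5)]
for _ in range(300):
    learn(old_pairs, slow, lr=0.02)

# New learning on the slow weights blurs the old associations.
new_pairs = [(rng.normal(size=n), rng.normal(size=n)) for _ in range(5)]
for _ in range(300):
    learn(new_pairs, slow, lr=0.02)

# Rehearse just two old pairs, training only the decaying fast weights.
for _ in range(300):
    learn(old_pairs[:2], fast, lr=0.05, decay=0.98)

x0, y0 = old_pairs[0]                           # one of the rehearsed pairs
print("error, slow weights only:", np.linalg.norm(slow @ x0 - y0))
print("error, slow plus fast:   ", np.linalg.norm(effective() @ x0 - y0))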

The Cascade-Correlation Learning Architecture

TLDR
The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no back-propagation of error signals through the connections of the network.

Improving neural networks by preventing co-adaptation of feature detectors

When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case.
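
Since the description above is concrete (randomly omit half of the hidden feature detectors on each training case, then use all of them at test time with their activity scaled down), here is a minimal sketch of that scheme; the ReLU hidden layer and all sizes are illustrative assumptions rather than details of the paper.

import numpy as np

rng = np.random.default_rng(4)
p_keep = 0.5                                    # keep each detector with probability 1/2

def hidden_activations(x, W, train=True):
    h = np.maximum(0.0, x @ W)                  # hidden "feature detectors" (ReLU here)
    if train:
        mask = rng.random(h.shape) < p_keep     # omit a random half on this training case
        return h * mask
    return h * p_keep                           # test time: all detectors, scaled down

W = rng.normal(scale=0.1, size=(16, 64))
x = rng.normal(size=16)
print(hidden_activations(x, W, train=True)[:6])   # roughly half the detectors silenced
print(hidden_activations(x, W, train=False)[:6])  # averaged, "mean network" behaviour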