Corpus ID: 234357710

TAG: Task-based Accumulated Gradients for Lifelong learning

@article{Malviya2021TAGTA,
  title={TAG: Task-based Accumulated Gradients for Lifelong learning},
  author={Pranshu Malviya and Balaraman Ravindran and A. P. Sarath Chandar},
  journal={ArXiv},
  year={2021},
  volume={abs/2105.05155}
}
The experiments are performed on four benchmark datasets: Split-CIFAR100, Split-miniImageNet, Split-CUB, and 5-dataset. Split-CIFAR100 and Split-miniImageNet split the CIFAR-100 (Krizhevsky et al., 2009; Mirzadeh et al., 2020) and miniImageNet (Vinyals et al., 2016; Chaudhry et al., 2019) datasets into 20 disjoint 5-way classification tasks. Split-CUB splits the CUB (Wah et al., 2011) dataset into 20 disjoint tasks with 10 classes per task. 5-dataset is a sequence of five different datasets…
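
For concreteness, the splitting step can be illustrated with a short sketch: the 100 CIFAR-100 class labels are partitioned into 20 disjoint groups of 5 classes, and each group's example indices form one task. The function and variable names below are illustrative, not the paper's actual benchmark code.

```python
import numpy as np

def split_into_tasks(labels, n_tasks=20, classes_per_task=5, seed=0):
    """Partition a 100-class label set into n_tasks disjoint classes_per_task-way tasks.

    Returns a list of index arrays, one per task (illustrative sketch; the
    benchmark code used in the paper may assign classes differently).
    """
    rng = np.random.RandomState(seed)
    classes = rng.permutation(np.unique(labels))
    assert len(classes) == n_tasks * classes_per_task
    task_indices = []
    for t in range(n_tasks):
        task_classes = classes[t * classes_per_task:(t + 1) * classes_per_task]
        task_indices.append(np.where(np.isin(labels, task_classes))[0])
    return task_indices

# usage with dummy CIFAR-100-style labels (50,000 training examples, 100 classes)
labels = np.random.randint(0, 100, size=50_000)
tasks = split_into_tasks(labels)
print(len(tasks), [len(t) for t in tasks[:3]])
```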

Model Zoo: A Growing Brain That Learns Continually

This paper uses statistical learning theory and experimental analysis to show how multiple tasks can interact with each other in a non-trivial fashion when a single model is trained on them, and motivates a method named Model Zoo which grows an ensemble of small models, each of which is trained during one episode of continual learning.
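
A minimal sketch of the growing-ensemble idea, assuming one small PyTorch model is trained per episode and that predictions for a task average the logits of the models trained on it; the real Model Zoo additionally chooses which past tasks each new model revisits, which is omitted here. All names and sizes are illustrative.

```python
import torch
import torch.nn as nn

def small_model(in_dim=32 * 32 * 3, n_classes=5):
    # a deliberately small learner, one per continual-learning episode
    return nn.Sequential(nn.Flatten(), nn.Linear(in_dim, 128), nn.ReLU(),
                         nn.Linear(128, n_classes))

class ModelZooSketch:
    """Illustrative sketch: grow the ensemble by one small model per episode."""
    def __init__(self):
        self.models, self.tasks = [], []

    def train_episode(self, task_id, loader, epochs=1):
        model = small_model()
        opt = torch.optim.SGD(model.parameters(), lr=0.01)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        self.models.append(model)
        self.tasks.append(task_id)

    @torch.no_grad()
    def predict(self, task_id, x):
        # average the logits of every model that was trained on this task
        logits = [m(x) for m, t in zip(self.models, self.tasks) if t == task_id]
        return torch.stack(logits).mean(dim=0).argmax(dim=1)
```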

Representation Ensembling for Synergistic Lifelong Learning with Quasilinear Complexity

This work proposes two algorithms: representation ensembles of (1) trees and (2) networks, which demonstrate both forward and backward transfer in a variety of simulated and real data scenarios, including tabular, image, spoken, and adversarial tasks.

References

Showing 1-10 of 66 references

Gradient Based Memory Editing for Task-Free Continual Learning

This paper proposes a principled approach to "edit" stored examples so that the memory carries more up-to-date information from the data stream, applying gradient updates to the stored examples so that they become more likely to be forgotten in future updates.
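
A rough sketch of the editing step, under the simplifying assumption that an example is edited by one gradient-ascent step on its own loss under the current model; the published method instead scores the edit by the loss increase after a look-ahead update on the incoming batch. Function and parameter names are illustrative.

```python
import torch
import torch.nn.functional as F

def edit_memory_example(model, x_mem, y_mem, edit_lr=0.1):
    """One gradient-based edit of a stored example (illustrative only).

    Nudges the stored input in the direction that increases its loss, so the
    edited example carries information the current model is about to forget.
    """
    x = x_mem.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y_mem)
    grad_x, = torch.autograd.grad(loss, x)
    return (x + edit_lr * grad_x).detach()   # gradient *ascent* on the input
```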

Expert Gate: Lifelong Learning with a Network of Experts

A model of lifelong learning based on a Network of Experts is presented, with a set of gating autoencoders that learn a representation for the task at hand and, at test time, automatically forward the test sample to the relevant expert.
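
A minimal sketch of the gating mechanism, assuming pre-extracted features and one small undercomplete autoencoder plus one expert classifier per task; at test time each sample is routed to the expert whose autoencoder reconstructs it with the lowest error. Module sizes and names are illustrative.

```python
import torch
import torch.nn as nn

class GatingAutoencoder(nn.Module):
    """One small undercomplete autoencoder per task (illustrative sizes)."""
    def __init__(self, in_dim=512, code_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, in_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

@torch.no_grad()
def route_to_expert(x_features, autoencoders, experts):
    """Pick, per sample, the expert whose gating autoencoder reconstructs it best."""
    errors = torch.stack([((ae(x_features) - x_features) ** 2).mean(dim=1)
                          for ae in autoencoders])        # (n_experts, batch)
    best = errors.argmin(dim=0)                           # per-sample expert id
    return torch.stack([experts[int(i)](x_features[j])
                        for j, i in enumerate(best)])
```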

Continual Unsupervised Representation Learning

The proposed approach (CURL) performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting.

Continual Learning with Adaptive Weights (CLAW)

An approach called Continual Learning with Adaptive Weights (CLAW), based on probabilistic modelling and variational inference, is introduced; it achieves state-of-the-art performance on six benchmarks both in overall continual learning performance, as measured by classification accuracy, and in addressing catastrophic forgetting.

Class-incremental learning: survey and performance evaluation

This paper provides a complete survey of existing methods for incremental learning, and in particular an extensive experimental evaluation of twelve class-incremental methods, including a comparison on multiple large-scale datasets, an investigation into small and large domain shifts, and a comparison across various network architectures.

Lifelong Learning with Dynamically Expandable Networks

The network obtained by fine-tuning on all tasks achieves significantly better performance than the batch models, which shows that the method can be used to estimate the optimal network structure even when all tasks are available from the start.

Continual learning: A comparative study on how to defy forgetting in classification tasks

This work focuses on task-incremental classification, where tasks arrive in a batch-like fashion and are delineated by clear boundaries; it studies the influence of model capacity, weight decay and dropout regularization, and the order in which tasks are presented, and compares methods in terms of required memory, computation time and storage.

Efficient Lifelong Learning with A-GEM

An improved version of GEM is proposed, dubbed Averaged GEM (A-GEM), which enjoys the same or even better performance as GEM, while being almost as computationally and memory efficient as EWC and other regularization-based methods.
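
The core of A-GEM is a single gradient correction: if the proposed gradient conflicts with the average gradient computed on the episodic memory, it is projected so that the memory loss is not increased. A sketch of that rule on flattened parameter gradients (names are illustrative):

```python
import torch

def agem_project(grad, grad_ref):
    """A-GEM-style gradient correction on flattened gradients.

    grad:     gradient of the loss on the current task's mini-batch
    grad_ref: gradient of the loss on a batch sampled from episodic memory
    """
    dot = torch.dot(grad, grad_ref)
    if dot < 0:  # update would increase the memory loss: project it away
        grad = grad - (dot / torch.dot(grad_ref, grad_ref)) * grad_ref
    return grad
```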

La-MAML: Look-ahead Meta Learning for Continual Learning

This work proposes Look-ahead MAML (La-MAML), a fast optimisation-based meta-learning algorithm for online continual learning aided by a small episodic memory; the proposed modulation of per-parameter learning rates in its update provides a more flexible and efficient way to mitigate catastrophic forgetting than conventional prior-based methods.
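
A toy sketch of the per-parameter learning-rate modulation, assuming a single weight vector, one look-ahead step on the incoming batch, and a meta-loss on replayed memory data; the full La-MAML update is more involved (e.g., it meta-learns over multi-step trajectories). All names here are illustrative.

```python
import torch
import torch.nn.functional as F

# toy setup: one weight vector and one learnable log-learning-rate per weight
w = torch.randn(5, requires_grad=True)
log_lr = torch.full((5,), -2.0, requires_grad=True)
meta_opt = torch.optim.SGD([log_lr], lr=0.01)

def loss_fn(weights, x, y):
    return F.mse_loss(x @ weights, y)

x_new, y_new = torch.randn(8, 5), torch.randn(8)   # incoming stream batch
x_mem, y_mem = torch.randn(8, 5), torch.randn(8)   # replay from episodic memory

# inner (look-ahead) step, kept differentiable w.r.t. the per-parameter rates
g = torch.autograd.grad(loss_fn(w, x_new, y_new), w, create_graph=True)[0]
w_fast = w - torch.exp(log_lr) * g

# the meta-loss on memory data updates the per-parameter learning rates
meta_opt.zero_grad()
loss_fn(w_fast, x_mem, y_mem).backward()
meta_opt.step()

# commit the inner step to the real weights using the updated rates
with torch.no_grad():
    w -= torch.exp(log_lr) * g
```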

On Tiny Episodic Memories in Continual Learning

This work empirically analyzes the effectiveness of a very small episodic memory in a CL setup where each training example is seen only once, and finds that repetitive training on even tiny memories of past tasks does not harm generalization; on the contrary, it improves it.
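
A sketch of such a tiny memory, assuming a reservoir-sampling write policy (one of several writing strategies studied in this line of work) so that every example seen once has an equal chance of being retained; class and parameter names are illustrative. During training, a few examples sampled from the buffer would be mixed into each incoming mini-batch.

```python
import random

class TinyEpisodicMemory:
    """Reservoir-sampled buffer: each streamed example, seen only once,
    has an equal probability of ending up in the small buffer."""
    def __init__(self, capacity=50):
        self.capacity, self.seen, self.buffer = capacity, 0, []

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example   # overwrite a random slot

    def sample(self, k):
        return random.sample(self.buffer, min(k, len(self.buffer)))
```
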
...