Corpus ID: 227162606

Generalized Variational Continual Learning

Noel Loo, Siddharth Swaroop, Richard E. Turner
Continual learning deals with training models on new tasks and datasets in an online fashion. One strand of research has used probabilistic regularization for continual learning, with two of the main approaches in this vein being Online Elastic Weight Consolidation (Online EWC) and Variational Continual Learning (VCL). VCL employs variational inference, which in other settings has been improved empirically by applying likelihood-tempering. We show that applying this modification to VCL recovers… 
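Online EWC, named above, regularizes new-task training with a quadratic penalty on movement away from the previous tasks' parameters, weighted by a running diagonal Fisher information estimate. A minimal sketch of that penalty in plain Python (the function name, the `lam` strength parameter, and the flat-list parameterization are illustrative choices, not taken from the paper):

```python
def online_ewc_penalty(theta, theta_prev, fisher, lam=1.0):
    """Quadratic Online EWC-style penalty:
        (lam / 2) * sum_i F_i * (theta_i - theta_prev_i)^2

    theta      -- current parameters (flat list of floats)
    theta_prev -- parameters after training on the previous task
    fisher     -- running diagonal Fisher information estimate
    lam        -- regularization strength
    """
    return 0.5 * lam * sum(
        f * (t - tp) ** 2 for t, tp, f in zip(theta, theta_prev, fisher)
    )

# The penalty vanishes when the parameters have not moved:
theta_prev = [1.0, -2.0, 0.5]
fisher = [0.9, 0.1, 0.4]
print(online_ewc_penalty(theta_prev, theta_prev, fisher))  # 0.0
```

In training, this term is added to the new task's loss, so high-Fisher (important) weights are held close to their old values while low-Fisher weights stay free to adapt.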

Mixture-of-Variational-Experts for Continual Learning

This work proposes an optimality principle that facilitates a trade-off between learning and forgetting and proposes a neural network layer for continual learning, called Mixture-of-Variational-Experts (MoVE), that alleviates forgetting while enabling the beneficial transfer of knowledge to new tasks.

Continual Learning via Sequential Function-Space Variational Inference

It is demonstrated that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods while depending less on maintaining a set of representative points from previous tasks.

Dynamic VAEs with Generative Replay for Continual Zero-shot Learning

A novel continual zero-shot learning (DVGR-CZSL) model that grows in size with each task and uses generative replay to update itself with previously learned classes to avoid forgetting is proposed.

Collapsed Variational Bounds for Bayesian Neural Networks

The new bounds significantly improve the performance of Gaussian mean-field VI applied to BNNs on a variety of data sets, and it is found that the tighter ELBOs can be good optimization targets for learning the hyperparameters of hierarchical priors.

Continual Learning with Dependency Preserving Hypernetworks

This work proposes a recurrent neural network (RNN) based hypernetwork that can generate layer weights efficiently while allowing for dependencies across them; the proposed methods outperformed the baselines in all the considered CL settings and tasks.

Posterior Meta-Replay for Continual Learning

This work studies principled ways to tackle the CL problem by adopting a Bayesian perspective, focusing on continually learning a task-specific posterior distribution via a shared meta-model, a task-conditioned hypernetwork, in sharp contrast to most Bayesian CL approaches that focus on the recursive update of a single posterior distribution.

Variational Continual Proxy-Anchor for Deep Metric Learning

This paper extends the proxy-anchor method by posing it within the continual learning framework, motivated by its batch-expected loss form (instead of instance-expected, typical in deep learning), which can potentially incur the catastrophic forgetting of historic batches.

Natural continual learning: success is a journey, not (just) a destination

Natural Continual Learning (NCL) is proposed, a new method that uses Bayesian weight regularization to encourage good performance on all tasks at convergence and combines this with gradient projection using the prior precision, which prevents catastrophic forgetting during optimization.
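The gradient projection described in the snippet can be written schematically as preconditioning the task gradient with the inverse prior precision; the symbols below ($\Lambda_{t-1}$ for the prior precision matrix, $\eta$ for the learning rate, $\mathcal{L}_t$ for the current task loss) are assumptions consistent with the summary, not the paper's exact notation:

```latex
\theta \leftarrow \theta - \eta\, \Lambda_{t-1}^{-1}\, \nabla_\theta \mathcal{L}_t(\theta)
```

Directions that previous tasks constrain tightly (high prior precision) are scaled down, while poorly constrained directions remain free to change.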

Continual Learning of Multi-modal Dynamics with External Memory

A novel continual learning method is proposed that maintains a descriptor of the mode of each encountered sequence in a neural episodic memory; it transfers knowledge across tasks by retrieving the descriptors of past-task modes similar to the current sequence and feeding these descriptors into its transition kernel as control input.

Improving and Understanding Variational Continual Learning

This paper reports significantly improved results on what was already a competitive approach to mean-field variational Bayesian neural networks, and compares the solution to what an 'ideal' continual learning solution might be.

Variational Continual Learning

Variational continual learning is developed, a simple but general framework for continual learning that fuses online variational inference and recent advances in Monte Carlo VI for neural networks that outperforms state-of-the-art continual learning methods.
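VCL's online variational inference step is a recursive posterior approximation in which the previous approximate posterior acts as the prior for the next task; a standard statement of the framework, with $\mathcal{D}_t$ the data for task $t$ and $Z_t$ a normalizing constant:

```latex
q_t(\theta) = \operatorname*{arg\,min}_{q \in \mathcal{Q}} \;
\mathrm{KL}\!\left( q(\theta) \,\Big\|\, \frac{1}{Z_t}\, q_{t-1}(\theta)\, p(\mathcal{D}_t \mid \theta) \right)
```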

Continual Learning with Adaptive Weights (CLAW)

An approach called Continual Learning with Adaptive Weights (CLAW), which is based on probabilistic modelling and variational inference, is introduced; it achieves state-of-the-art performance on six benchmarks in terms of overall continual learning performance, as measured by classification accuracy, and in terms of addressing catastrophic forgetting.

SOLA: Continual Learning with Second-Order Loss Approximation

This work studies continual learning from the perspective of loss landscapes and proposes to construct a second-order Taylor approximation of the loss functions of previous tasks, which is effective in avoiding catastrophic forgetting and outperforms several baseline algorithms that do not explicitly store data samples.
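The second-order Taylor approximation referred to above expands each previous task's loss around its converged parameters $\theta^*$; a generic sketch with $H$ the Hessian at $\theta^*$ (not the paper's exact notation):

```latex
\mathcal{L}_{\text{prev}}(\theta) \;\approx\; \mathcal{L}(\theta^*)
+ \nabla \mathcal{L}(\theta^*)^{\top} (\theta - \theta^*)
+ \tfrac{1}{2}\, (\theta - \theta^*)^{\top} H\, (\theta - \theta^*)
```

Keeping only this quadratic surrogate lets the method penalize forgetting without retaining the raw data samples of earlier tasks.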

beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework

Learning an interpretable factorised representation of the independent data generative factors of the world without supervision is an important precursor for the development of artificial… 
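The constrained variational framework named in the title weights the KL term of the standard VAE objective by a coefficient $\beta > 1$, trading reconstruction quality for disentanglement pressure:

```latex
\mathcal{L}(\theta, \phi; x) =
\mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
- \beta\, \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

Setting $\beta = 1$ recovers the ordinary VAE ELBO.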

Practical Deep Learning with Bayesian Principles

This work enables practical deep learning while preserving the benefits of Bayesian principles, applying techniques such as batch normalisation, data augmentation, and distributed training to achieve similar performance in about the same number of epochs as the Adam optimiser.

Variational Dropout Sparsifies Deep Neural Networks

Variational Dropout is extended to the case when dropout rates are unbounded, a way to reduce the variance of the gradient estimator is proposed, and first experimental results with individual dropout rates per weight are reported.

Fixing a Broken ELBO

This framework derives variational lower and upper bounds on the mutual information between the input and the latent variable, and uses these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy.
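The rate-distortion view in the snippet decomposes the ELBO into a distortion term $D$ (expected negative reconstruction log-likelihood) and a rate term $R$ (expected KL to the prior); a sketch using the prior $p(z)$ as the marginal over codes (the paper also considers tighter variational marginals):

```latex
D = -\,\mathbb{E}_{p(x)}\,\mathbb{E}_{q(z \mid x)}\!\left[ \log p(x \mid z) \right],
\qquad
R = \mathbb{E}_{p(x)}\!\left[ \mathrm{KL}\!\left( q(z \mid x) \,\|\, p(z) \right) \right],
\qquad
\mathrm{ELBO} = -(D + R)
```

Models with the same ELBO can thus sit at very different points on the rate-distortion curve, which is the sense in which the plain ELBO is "broken" as a training target for representation learning.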

Noisy Natural Gradient as Variational Inference

It is shown that natural gradient ascent with adaptive weight noise implicitly fits a variational posterior to maximize the evidence lower bound (ELBO), which allows us to train full-covariance, fully factorized, or matrix-variate Gaussian variational posteriors using noisy versions of natural gradient, Adam, and K-FAC, respectively, making it possible to scale up to modern-size ConvNets.

Progress & Compress: A scalable framework for continual learning

The progress & compress approach is demonstrated on sequential classification of handwritten alphabets as well as two reinforcement learning domains: Atari games and 3D maze navigation.