Corpus ID: 227151261

Energy-Based Models for Continual Learning

Shuang Li, Yilun Du, Gido M. van de Ven, Antonio Torralba, Igor Mordatch
We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems. Instead of tackling continual learning via the use of external memory, growing models, or regularization, EBMs have a natural way to support a dynamically-growing number of tasks or classes that causes less interference with previously learned information. We find that EBMs outperform the baseline methods by a large margin on several continual learning benchmarks. We also show that EBMs are… 
Posterior Meta-Replay for Continual Learning
This work studies principled ways to tackle the CL problem by adopting a Bayesian perspective and focuses on continually learning a task-specific posterior distribution via a shared meta-model, a task-conditioned hypernetwork, in sharp contrast to most Bayesian CL approaches, which focus on the recursive update of a single posterior distribution.
Energy-based Latent Aligner for Incremental Learning
The implicit regularization that is offered by the proposed ELI methodology can be used as a plug-and-play module in existing incremental learning methodologies, corroborating its effectiveness and complementary advantage to the existing art.
Exemplar-free Class Incremental Learning via Discriminative and Comparable One-class Classifiers
DisCOIL follows the basic principle of POC, but it adopts variational auto-encoders (VAE) instead of other well-established one-class classifiers (e.g. deep SVDD), because a trained VAE can not only identify the probability of an input sample belonging to a class but also generate pseudo samples of the class to assist in learning new tasks.
Class-Incremental Learning with Generative Classifiers
This paper proposes a new strategy for class-incremental learning: generative classification, which is to learn the joint distribution p(x, y), factorized as p(x|y)p(y), and to perform classification using Bayes' rule.
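The factorization p(x|y)p(y) in the summary above can be illustrated with a toy sketch. This is not the paper's model: it assumes a one-dimensional input and a Gaussian p(x|y) per class, and the names `GaussianGenerativeClassifier`, `add_class`, and `predict` are hypothetical. It only shows why a per-class generative model lets a new class be added without touching the models of earlier classes.

```python
import math


class GaussianGenerativeClassifier:
    """Toy generative classifier: one 1-D Gaussian p(x|y) per class plus a prior p(y).

    Assumption for illustration only; real generative classifiers use richer models.
    """

    def __init__(self):
        self.stats = {}  # label -> (mean, variance, sample count)

    def add_class(self, label, samples):
        # Fit p(x|y=label) from this class's samples only; other classes are untouched.
        n = len(samples)
        mean = sum(samples) / n
        var = sum((s - mean) ** 2 for s in samples) / n + 1e-6  # floor avoids zero variance
        self.stats[label] = (mean, var, n)

    def log_joint(self, x, label):
        # log p(x, y) = log p(y) + log p(x|y), with p(y) estimated from class counts.
        mean, var, n = self.stats[label]
        total = sum(count for _, _, count in self.stats.values())
        log_prior = math.log(n / total)
        log_likelihood = -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        return log_prior + log_likelihood

    def predict(self, x):
        # Bayes' rule: argmax_y p(y|x) = argmax_y p(x|y) p(y), since p(x) does not depend on y.
        return max(self.stats, key=lambda label: self.log_joint(x, label))
```

In this sketch, calling `add_class` for a third class after two have been learned changes only the prior term for the earlier classes; their fitted means and variances stay fixed.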
Directly Training Joint Energy-Based Models for Conditional Synthesis and Calibrated Prediction of Multi-Attribute Data
It is shown that architectures for multi-attribute prediction can be reinterpreted as energy-based models (EBMs) and are capable of both accurate, calibrated predictions and high-quality conditional synthesis of novel attribute combinations.
Improved Contrastive Divergence Training of Energy Based Models
It is shown that a gradient term neglected in the popular contrastive divergence formulation is both tractable to estimate and important to avoid training instabilities in previous models, and how data augmentation, multi-scale processing, and reservoir sampling can be used to improve model robustness and generation quality.
Mitigating Forgetting in Online Continual Learning with Neuron Calibration
A novel method is presented that attempts to mitigate catastrophic forgetting in online continual learning from a new perspective, namely neuron calibration; it is lightweight and applicable to general feed-forward neural network-based models.
Particle Dynamics for Learning EBMs
This paper proposes an alternative way to obtain samples that avoids crude MCMC sampling from the current model and aims to match the current distribution in finite time, and demonstrates its effectiveness empirically in comparison to MCMC-based learning methods.
The CLEAR Benchmark: Continual LEArning on Real-World Imagery
This paper introduces CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts in the real world that spans a decade (2004-2014), and proposes a novel "streaming" protocol for CL that always tests on the (near) future.


Distributional Reinforcement Learning for Energy-Based Sequential Models
A distillation technique is proposed, which can only be applied under limited conditions and is illustrated on GAM-based experiments, and is a general approach applicable to any sequential EBM.
Task-Free Continual Learning
This work investigates how to transform continual learning to an online setup, and develops a system that keeps on learning over time in a streaming fashion, with data distributions gradually changing and without the notion of separate tasks.
Three scenarios for continual learning
Three continual learning scenarios are described based on whether task identity is provided at test time and, in case it is not, whether it must be inferred. It is found that, when task identity must be inferred, regularization-based approaches fail and replaying representations of previous experiences seems required for solving this scenario.
Task-Agnostic Continual Learning Using Online Variational Bayes With Fixed-Point Updates
This work derives novel fixed-point equations for the online variational Bayes optimization problem for multivariate Gaussian parametric distributions and obtains an algorithm (FOO-VB) that can handle nonstationary data distributions using a fixed architecture and without using external memory.
Gradient Episodic Memory for Continual Learning
A model for continual learning, called Gradient Episodic Memory (GEM), is proposed that alleviates forgetting while allowing beneficial transfer of knowledge to previous tasks.
Generative Models from the perspective of Continual Learning
It is found that among all models, the original GAN performs best and among Continual Learning strategies, generative replay outperforms all other methods.
Reinforcement Learning with Deep Energy-Based Policies
A method is proposed for learning expressive energy-based policies for continuous states and actions, which was previously feasible only in tabular domains, and a new algorithm, called soft Q-learning, that expresses the optimal policy via a Boltzmann distribution is applied.
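As a rough illustration of a policy expressed via a Boltzmann distribution, the sketch below turns a list of Q-values into action probabilities proportional to exp(Q / temperature). It assumes a small discrete action set, which is simpler than the continuous setting the paper targets, and the function names and the `temperature` parameter are hypothetical choices for this example.

```python
import math


def boltzmann_policy(q_values, temperature=1.0):
    """Return action probabilities proportional to exp(Q / temperature).

    Assumes a finite list of Q-values for one state (illustrative simplification).
    """
    q_max = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]


def soft_value(q_values, temperature=1.0):
    """Soft (log-sum-exp) state value: temperature * log(sum_a exp(Q(a) / temperature))."""
    q_max = max(q_values)
    total = sum(math.exp((q - q_max) / temperature) for q in q_values)
    return q_max + temperature * math.log(total)
```

A lower temperature concentrates probability on the highest-valued action, while a higher temperature spreads it more evenly; the soft value is never smaller than the largest Q-value.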
Overcoming Catastrophic Forgetting for Continual Learning via Model Adaptation
This paper proposes a very different approach, called Parameter Generation and Model Adaptation (PGMA), to dealing with the problem of catastrophic forgetting in standard neural network architectures.
A Tutorial on Energy-Based Learning
The EBM approach provides a common theoretical framework for many learning models, including traditional discriminative and generative approaches, as well as graph-transformer networks, conditional random fields, maximum margin Markov networks, and several manifold learning methods.
Meta-Learning Deep Energy-Based Memory Models
A novel meta-learning approach to energy-based memory models (EBMM) that allows one to use an arbitrary neural architecture as an energy model and quickly store patterns in its weights and is capable of associative retrieval that outperforms existing memory systems in terms of the reconstruction error and compression rate.