Corpus ID: 235614241

Bayesian Context Aggregation for Neural Processes

@inproceedings{Volpp2021BayesianCA,
  title={Bayesian Context Aggregation for Neural Processes},
  author={Michael Volpp and Fabian Fl{\"u}renbrock and Lukas Gro{\ss}berger and Christian Daniel and Gerhard Neumann},
  booktitle={ICLR},
  year={2021}
}
Formulating scalable probabilistic regression models with reliable uncertainty estimates has been a long-standing challenge in machine learning research. Recently, casting probabilistic regression as a multi-task learning problem in terms of conditional latent variable (CLV) models such as the Neural Process (NP) has shown promising results. In this paper, we focus on context aggregation, a central component of such architectures, which fuses information from multiple context data points. So… 
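Because context aggregation is the paper's focus, a toy sketch may help make the idea concrete. The following NumPy snippet is a minimal illustration, not the authors' code: it assumes each context point has already been encoded into a Gaussian "latent observation" (a mean r_i and a variance sigma_i^2) and fuses these into a posterior over the latent task variable z by standard Gaussian conditioning; the toy encoder stand-ins and all names are assumptions made for illustration.

import numpy as np

def bayesian_aggregation(r, sigma2, mu0=0.0, var0=1.0):
    # r      : (n, d) latent observation means, one row per context point
    # sigma2 : (n, d) latent observation variances
    # mu0, var0 : factorized Gaussian prior over the latent task variable z
    precision = 1.0 / var0 + np.sum(1.0 / sigma2, axis=0)
    var_z = 1.0 / precision                            # posterior variance of z
    mu_z = mu0 + var_z * np.sum((r - mu0) / sigma2, axis=0)
    return mu_z, var_z                                 # each of shape (d,)

# Toy stand-ins for the encoder networks that would produce (r, sigma2):
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 1))
y = np.sin(x)
r = np.concatenate([x, y], axis=1)                     # stand-in "mean head"
sigma2 = 0.1 + rng.random(size=r.shape)                # stand-in "variance head"
mu_z, var_z = bayesian_aggregation(r, sigma2)
print(mu_z, var_z)

Because each context point enters as a Gaussian factor, the fusion is permutation-invariant and the posterior variance of z shrinks as more context points are added, which is what makes such an aggregator uncertainty-aware.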
Citations

Multi-Task Neural Processes
TLDR: The proposed multi-task neural processes derive function priors in a hierarchical Bayesian inference framework, which enables each task to incorporate the shared knowledge provided by related tasks into the context of its prediction function.
Neural Processes with Stochastic Attention: Paying more attention to the context dataset
TLDR: This work proposes a stochastic attention mechanism for NPs to capture appropriate context information, and empirically shows that the approach substantially outperforms conventional NPs across various domains, including 1D regression, a predator-prey model, and image completion (a sketch of the attention-based aggregation this builds on appears after this list).
What Matters For Meta-Learning Vision Regression Tasks?
TLDR: This paper designs two new types of cross-category-level vision regression tasks, namely object discovery and pose estimation, of unprecedented complexity in the meta-learning domain for computer vision, and proposes adding functional contrastive learning (FCL) over the task representations in Conditional Neural Processes.
Hidden Parameter Recurrent State Space Models for Changing Dynamics Scenarios
TLDR: This work introduces the Hidden Parameter Recurrent State Space Models (HiP-RSSMs), a framework that parametrizes a family of related dynamical systems with a low-dimensional set of latent factors and outperforms RSSMs and competing multi-task models on several challenging robotic benchmarks, both on real-world systems and in simulation.
Multi-fidelity Hierarchical Neural Processes
TLDR: Multi-fidelity Hierarchical Neural Processes (MF-HNP), a unified neural latent variable model for multi-fidelity surrogate modeling, shows great promise for speeding up high-dimensional complex simulations and achieves competitive performance in terms of accuracy and uncertainty estimation.
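
For contrast with the stochastic-attention entry above, here is a minimal NumPy sketch of plain deterministic cross-attention over a context set, the mechanism that attentive NP variants build on; the stochastic extension itself is not reproduced, and all names here are illustrative assumptions.

import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(x_target, x_context, r_context):
    # x_target  : (m, dx) target inputs (queries)
    # x_context : (n, dx) context inputs (keys)
    # r_context : (n, dr) per-context representations (values)
    scale = np.sqrt(x_target.shape[1])
    logits = x_target @ x_context.T / scale            # (m, n) similarity scores
    weights = softmax(logits, axis=1)                  # attention over the context set
    return weights @ r_context                         # (m, dr) fused per target

rng = np.random.default_rng(1)
x_context = rng.normal(size=(6, 2))
r_context = rng.normal(size=(6, 4))
x_target = rng.normal(size=(3, 2))
print(cross_attention(x_target, x_context, r_context).shape)  # (3, 4)

Unlike mean pooling or the Gaussian fusion sketched earlier, each target input attends over the whole context set, so the aggregated representation is target-specific rather than a single global summary.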

References

Showing 1-10 of 63 references
Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
TLDR: This work reformulates the model-agnostic meta-learning algorithm (MAML) of Finn et al. (2017) as a method for probabilistic inference in a hierarchical Bayesian model, and proposes an improvement to MAML that makes use of techniques from approximate inference and curvature estimation.
Meta-Learning Probabilistic Inference for Prediction
TLDR: VERSA is introduced, an instance of the framework employing a flexible and versatile amortization network that takes few-shot learning datasets as input, with arbitrary numbers of shots, and outputs a distribution over task-specific parameters in a single forward pass, amortizing the cost of inference and removing the need for second derivatives during training.
The Functional Neural Process
TLDR: A new family of exchangeable stochastic processes, the Functional Neural Processes (FNPs), is presented; it is demonstrated that they scale to large datasets through mini-batch optimization, and it is described how they make predictions for new points via their posterior predictive distribution.
Learning Structured Output Representation using Deep Conditional Generative Models
TLDR: A deep conditional generative model for structured output prediction using Gaussian latent variables is developed, trained efficiently in the framework of stochastic gradient variational Bayes, and allows for fast prediction using stochastic feed-forward inference.
Sparse Gaussian Processes using Pseudo-inputs
TLDR: It is shown that this new Gaussian process (GP) regression model can match full GP performance with a small number M of pseudo-inputs, i.e., very sparse solutions, and that it significantly outperforms other approaches in this regime.
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
TLDR: A new theoretical framework casts dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, mitigating the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy (a minimal sketch of the resulting test-time procedure appears after this list).
Scalable Hyperparameter Transfer Learning
TLDR: This work proposes a multi-task adaptive Bayesian linear regression model for transfer learning in Bayesian optimization (BO), whose complexity is linear in the number of function evaluations: one Bayesian linear regression model is associated with each black-box optimization problem (or task), while transfer learning is achieved by coupling the models through a shared deep neural net.
Probabilistic Model-Agnostic Meta-Learning
TLDR: This paper proposes a probabilistic meta-learning algorithm that can sample models for a new task from a model distribution trained via a variational lower bound, and shows how reasoning about ambiguity can also be used for downstream active learning problems.
Bayesian Model-Agnostic Meta-Learning
TLDR: The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework and can learn complex uncertainty structure beyond a point estimate or a simple Gaussian approximation during fast adaptation.
Task Clustering and Gating for Bayesian Multitask Learning
TLDR: A Bayesian approach is adopted in which some of the model parameters are shared and others are more loosely connected through a joint prior distribution that can be learned from the data, combining the best parts of both the statistical multilevel approach and the neural network machinery.
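
As flagged next to the dropout entry above, here is a minimal sketch of the test-time procedure that reference describes, Monte Carlo dropout: keep dropout active at prediction time and average several stochastic forward passes to obtain a predictive mean and variance. The tiny two-layer network and all names are illustrative assumptions, not the cited paper's code.

import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(size=(1, 64)) * 0.5
W2 = rng.normal(size=(64, 1)) * 0.5

def forward(x, p_drop=0.1):
    h = np.tanh(x @ W1)
    mask = rng.random(h.shape) > p_drop                # dropout stays ON at test time
    h = h * mask / (1.0 - p_drop)                      # inverted-dropout scaling
    return h @ W2

x = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)
samples = np.stack([forward(x) for _ in range(100)])   # T = 100 stochastic passes
mean, var = samples.mean(axis=0), samples.var(axis=0)
print(mean.ravel())
print(var.ravel())                                     # spread across passes approximates model uncertainty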