• Corpus ID: 219179403

Meta Learning as Bayes Risk Minimization

  title={Meta Learning as Bayes Risk Minimization},
  author={Shin-ichi Maeda and Toshiki Nakanishi and Masanori Koyama},
Meta-Learning is a family of methods that use a set of interrelated tasks to learn a model that can quickly learn a new query task from a possibly small contextual dataset. In this study, we use a probabilistic framework to formalize what it means for two tasks to be related and reframe the meta-learning problem into the problem of Bayesian risk minimization (BRM). In our formulation, the BRM optimal solution is given by the predictive distribution computed from the posterior distribution of… 


Meta-Learning Probabilistic Inference for Prediction
VERSA is introduced, an instance of the framework employing a flexible and versatile amortization network that takes few-shot learning datasets as inputs, with arbitrary numbers of shots, and outputs a distribution over task-specific parameters in a single forward pass, amortizing the cost of inference and relieving the need for second derivatives during training.
Bayesian Model-Agnostic Meta-Learning
The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework and is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation during fast adaptation.
Neural Processes
This work introduces a class of neural latent variable models which it calls Neural Processes (NPs), combining the best of both worlds: probabilistic, data-efficient and flexible, however they are also computationally intensive and thus limited in their applicability.
The Functional Neural Process
A new family of exchangeable stochastic processes, the Functional Neural Processes (FNPs), are presented and it is demonstrated that they are scalable to large datasets through mini-batch optimization and described how they can make predictions for new points via their posterior predictive distribution.
Meta-Learning with Adaptive Layerwise Metric and Subspace
This paper presents a feedforward neural network, referred to as T-net, where the linear transformation between two adjacent layers is decomposed as T W such that W is learned by task-specific learners and the transformation T is meta-learned to speed up the convergence of gradient updates for task- specific learners.
On First-Order Meta-Learning Algorithms
A family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates, including Reptile, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task.
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning
Towards a Neural Statistician
An extension of a variational autoencoder that can learn a method for computing representations, or statistics, of datasets in an unsupervised fashion is demonstrated that is able to learn statistics that can be used for clustering datasets, transferring generative models to new datasets, selecting representative samples of datasets and classifying previously unseen classes.
Meta-Learning with Warped Gradient Descent
WarpGrad meta-learns an efficiently parameterised preconditioning matrix that facilitates gradient descent across the task distribution and is computationally efficient, easy to implement, and can scale to arbitrarily large meta-learning problems.
We propose meta-curvature (MC), a framework to learn curvature information for better generalization and fast model adaptation. MC expands on the model-agnostic meta-learner (MAML) by learning to