Corpus ID: 238198171

ST-MAML: A Stochastic-Task based Method for Task-Heterogeneous Meta-Learning

@article{Wang2021STMAMLAS,
  title={ST-MAML: A Stochastic-Task based Method for Task-Heterogeneous Meta-Learning},
  author={Zhe Wang and Jake Grigsby and Arshdeep Sekhon and Yanjun Qi},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.13305}
}
Optimization-based meta-learning typically assumes tasks are sampled from a single distribution – an assumption that oversimplifies and limits the diversity of tasks that meta-learning can model. Handling tasks from multiple distributions is challenging for meta-learning because it adds ambiguity to task identities. This paper proposes a novel method, ST-MAML, that empowers model-agnostic meta-learning (MAML) to learn from multiple task distributions. ST-MAML encodes tasks using a stochastic…

References

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

This paper proposes a multimodal MAML (MMAML) framework that modulates its meta-learned prior parameters according to the identified mode, allowing more efficient fast adaptation. The paper also demonstrates the effectiveness of the model in modulating the meta-learned prior in response to the characteristics of tasks.

Hierarchically Structured Meta-learning

A hierarchically structured meta-learning (HSML) algorithm that explicitly tailors the transferable knowledge to different clusters of tasks, inspired by the way human beings organize knowledge, and extends the hierarchical structure to a continual learning environment.

Probabilistic Model-Agnostic Meta-Learning

This paper proposes a probabilistic meta-learning algorithm that can sample models for a new task from a model distribution that is trained via a variational lower bound, and shows how reasoning about ambiguity can also be used for downstream active learning problems.

Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace

This work demonstrates that the dimension of this learned subspace reflects the complexity of the task-specific learner's adaptation task, and also that the model is less sensitive to the choice of initial learning rates than previous gradient-based meta-learning methods.

Recasting Gradient-Based Meta-Learning as Hierarchical Bayes

This work reformulates the model-agnostic meta-learning algorithm (MAML) of Finn et al. (2017) as a method for probabilistic inference in a hierarchical Bayesian model and proposes an improvement to the MAML algorithm that makes use of techniques from approximate inference and curvature estimation.

Amortized Bayesian Meta-Learning

This work proposes a meta-learning method which efficiently amortizes hierarchical variational inference across tasks, learning a prior distribution over neural network weights so that a few steps of Bayes by Backprop will produce a good task-specific approximate posterior.

Automated Relational Meta-learning

An automated relational meta-learning (ARML) framework is proposed that automatically extracts cross-task relations and constructs a meta-knowledge graph; the results demonstrate the superiority of ARML over state-of-the-art baselines.

Meta-Learning with Adaptive Hyperparameters

A new weight update rule is proposed that greatly enhances inner-loop optimization (or fast adaptation) in the MAML framework. It introduces a small meta-network that adaptively generates per-step hyperparameters: learning rate and weight decay coefficients.

Meta-SGD: Learning to Learn Quickly for Few Shot Learning

Meta-SGD, an SGD-like, easily trainable meta-learner that can initialize and adapt any differentiable learner in just one step, shows highly competitive performance for few-shot learning on regression, classification, and reinforcement learning.

Bayesian Model-Agnostic Meta-Learning

The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework and is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation during fast adaptation.