• Corpus ID: 6719686

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

  title={Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks},
  author={Chelsea Finn and P. Abbeel and Sergey Levine},
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. [] Key Method In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot…

Figures and Tables from this paper

First-order Meta-Learned Initialization for Faster Adaptation in Deep Reinforcement Learning

It is shown that the effect of first-order approximations on recently proposed meta-learning algorithms which use the gradient updates of the learner itself perform just as well as their second-order counterparts while being 1 .

Regularizing Meta-Learning via Gradient Dropout

This paper introduces a simple yet effective method to alleviate the risk of overfitting for gradient-based meta-learning by randomly dropping the gradient in the inner-loop optimization of each parameter in deep neural networks, such that the augmented gradients improve generalization to new tasks.

B-Small: A Bayesian Neural Network Approach to Sparse Model-Agnostic Meta-Learning

  • Anish MadanRanjitha Prasad
  • Computer Science
    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
The proposed framework incorporates a sparse variational loss term alongside the loss function of MAML, which uses a sparsifying approximated KL divergence as a regularizer, and demonstrates applicability of the approach in distributed sensor networks, where sparsity and meta-learning can be beneficial.

Decoder Choice Network for Meta-Learning

An alternative design of the decoder with a structure that shares certain weights, thereby reducing the number of required parameters is provided, which combines ensemble learning with the proposed approach to improve the overall learning performance.

Alpha MAML: Adaptive Model-Agnostic Meta-Learning

An extension to MAML is introduced to incorporate an online hyperparameter adaptation scheme that eliminates the need to tune meta-learning and learning rates, and results with the Omniglot database demonstrate a substantial reduction in theneed to tune MAMl training hyperparameters and improvement to training stability with less sensitivity to hyperparam parameter choice.

Probabilistic Model-Agnostic Meta-Learning

This paper proposes a probabilistic meta-learning algorithm that can sample models for a new task from a model distribution that is trained via a variational lower bound, and shows how reasoning about ambiguity can also be used for downstream active learning problems.

Is the Meta-Learning Idea Able to Improve the Generalization of Deep Neural Networks on the Standard Supervised Learning?

A novel meta-learning based training procedure (MLTP) for DNNs is proposed and it is demonstrated that the meta- learning idea can indeed improve the generalization abilities ofDNNs on the standard supervised learning.

Meta-Learning in Neural Networks: A Survey

A new taxonomy is proposed that provides a more comprehensive breakdown of the space of meta-learning methods today and surveys promising applications and successes ofMeta-learning such as few-shot learning and reinforcement learning.


It is found that the upper bound of β depends on α, in contrast to the case of using the normal gradient descent method, and it is shown that the threshold of β increases as α approaches its own upper bound.

Meta-Learning with Adaptive Layerwise Metric and Subspace

This paper presents a feedforward neural network, referred to as T-net, where the linear transformation between two adjacent layers is decomposed as T W such that W is learned by task-specific learners and the transformation T is meta-learned to speed up the convergence of gradient updates for task- specific learners.



Meta-SGD: Learning to Learn Quickly for Few Shot Learning

Meta-SGD, an SGD-like, easily trainable meta-learner that can initialize and adapt any differentiable learner in just one step, shows highly competitive performance for few-shot learning on regression, classification, and reinforcement learning.

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning

This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.

Learning to reinforcement learn

This work introduces a novel approach to deep meta-reinforcement learning, which is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure.

On First-Order Meta-Learning Algorithms

A family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates, including Reptile, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task.

Overcoming catastrophic forgetting in neural networks

It is shown that it is possible to overcome the limitation of connectionist models and train networks that can maintain expertise on tasks that they have not experienced for a long time and selectively slowing down learning on the weights important for previous tasks.

How to train your MAML

This paper proposes various modifications to MAML that not only stabilize the system, but also substantially improve the generalization performance, convergence speed and computational overhead of MAMl, which it is called M AML++.

Meta-Learning with Memory-Augmented Neural Networks

The ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples is demonstrated.

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

Matching Networks for One Shot Learning

This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.