Corpus ID: 6719686

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

@inproceedings{Finn2017ModelAgnosticMF,
  title={Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks},
  author={Chelsea Finn and P. Abbeel and Sergey Levine},
  booktitle={ICML},
  year={2017}
}
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. [...] In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot…
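Below is a minimal sketch of the meta-training loop the abstract describes, written in JAX for a few-shot sine-regression toy problem. It is an illustrative reconstruction rather than the authors' code: the network size, learning rates, task distribution, and function names are assumptions.

import jax
import jax.numpy as jnp

def init_params(key, sizes=(1, 40, 40, 1)):
    # Small fully connected regressor; parameters are a list of (weight, bias) pairs.
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (n_in, n_out)) * 0.1, jnp.zeros(n_out)))
    return params

def forward(params, x):
    for w, b in params[:-1]:
        x = jnp.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def loss(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

def inner_update(params, x_s, y_s, inner_lr=0.01):
    # Task-specific adaptation: one gradient step on the support set.
    grads = jax.grad(loss)(params, x_s, y_s)
    return jax.tree_util.tree_map(lambda p, g: p - inner_lr * g, params, grads)

def maml_task_loss(params, x_s, y_s, x_q, y_q):
    # Meta-objective for one task: query-set loss of the adapted parameters.
    return loss(inner_update(params, x_s, y_s), x_q, y_q)

@jax.jit
def meta_step(params, batch, meta_lr=0.001):
    # Outer update: differentiate through the inner step, averaged over a batch of tasks.
    def batch_loss(p):
        return jnp.mean(jax.vmap(lambda t: maml_task_loss(p, *t))(batch))
    grads = jax.grad(batch_loss)(params)
    return jax.tree_util.tree_map(lambda p, g: p - meta_lr * g, params, grads)

def sample_sine_batch(key, n_tasks=4, k_shot=10):
    # Each task regresses a sine wave with a randomly drawn amplitude and phase.
    k1, k2, k3, k4 = jax.random.split(key, 4)
    amp = jax.random.uniform(k1, (n_tasks, 1, 1), minval=0.1, maxval=5.0)
    phase = jax.random.uniform(k2, (n_tasks, 1, 1), maxval=jnp.pi)
    x_s = jax.random.uniform(k3, (n_tasks, k_shot, 1), minval=-5.0, maxval=5.0)
    x_q = jax.random.uniform(k4, (n_tasks, k_shot, 1), minval=-5.0, maxval=5.0)
    return x_s, amp * jnp.sin(x_s - phase), x_q, amp * jnp.sin(x_q - phase)

key = jax.random.PRNGKey(0)
params = init_params(key)
for step in range(1000):
    key, sub = jax.random.split(key)
    params = meta_step(params, sample_sine_batch(sub))

Because the outer gradient is taken through the inner update, the initialization is explicitly optimized so that one gradient step on a new task's support set already fits its query set; this is the sense in which the model is trained to be easy to fine-tune.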
First-order Meta-Learned Initialization for Faster Adaptation in Deep Reinforcement Learning
Deep neural networks excel in regimes with large amounts of data, but tend to struggle when data is scarce or when they need to adapt quickly to changes in the task. Recent works have proposed…
Regularizing Meta-Learning via Gradient Dropout
TLDR
This paper introduces a simple yet effective method to alleviate the risk of overfitting in gradient-based meta-learning by randomly dropping per-parameter gradients during the inner-loop optimization of deep neural networks, such that the augmented gradients improve generalization to new tasks. A rough sketch of this idea follows below.
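The following sketch illustrates inner-loop gradient dropout under the same setup as the MAML example above, reusing its loss function and parameter pytree; the Bernoulli masking, drop probability, and function name are assumptions about the general idea, not the paper's exact scheme.

def inner_update_grad_dropout(key, params, x_s, y_s, inner_lr=0.01, drop_p=0.1):
    grads = jax.grad(loss)(params, x_s, y_s)
    leaves, treedef = jax.tree_util.tree_flatten(grads)
    keys = jax.random.split(key, len(leaves))
    # Randomly zero gradient entries so each task adapts along a different subset
    # of parameters, which regularizes the meta-learned initialization.
    masked = [g * jax.random.bernoulli(k, 1.0 - drop_p, g.shape)
              for g, k in zip(leaves, keys)]
    grads = jax.tree_util.tree_unflatten(treedef, masked)
    return jax.tree_util.tree_map(lambda p, g: p - inner_lr * g, params, grads)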
B-Small: A Bayesian Neural Network Approach to Sparse Model-Agnostic Meta-Learning
  • Anish Madan, Ranjitha Prasad
  • Computer Science
  • ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
TLDR
The proposed framework incorporates a sparse variational loss term alongside the loss function of MAML, which uses a sparsifying approximated KL divergence as a regularizer, and demonstrates applicability of the approach in distributed sensor networks, where sparsity and meta-learning can be beneficial.
Alpha MAML: Adaptive Model-Agnostic Meta-Learning
TLDR
An extension to MAML is introduced that incorporates an online hyperparameter adaptation scheme, eliminating the need to hand-tune the meta-learning and inner-loop learning rates; results on the Omniglot database demonstrate a substantial reduction in the need to tune MAML training hyperparameters and improved training stability, with less sensitivity to hyperparameter choice.
Probabilistic Model-Agnostic Meta-Learning
TLDR
This paper proposes a probabilistic meta-learning algorithm that can sample models for a new task from a model distribution that is trained via a variational lower bound, and shows how reasoning about ambiguity can also be used for downstream active learning problems.
Decoupling Adaptation from Modeling with Meta-Optimizers for Meta Learning
TLDR
This work begins with an experimental analysis of MAML, finding that deep models are crucial for its success, even given sets of simple tasks where a linear model would suffice on any individual task.
Role of Two Learning Rates in Convergence of Model-Agnostic Meta-Learning
Model-agnostic meta-learning (MAML) is known as a powerful meta-learning method. However, MAML is notorious for being hard to train because of the existence of two learning rates. Therefore, in this…
Meta-Learning with Adaptive Layerwise Metric and Subspace
TLDR
This paper presents a feedforward neural network, referred to as T-net, where the linear transformation between two adjacent layers is decomposed as TW such that W is learned by task-specific learners and the transformation T is meta-learned to speed up the convergence of gradient updates for task-specific learners.
Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation
TLDR
This paper proposes a multimodal MAML (MMAML) framework, which is able to modulate its meta-learned prior parameters according to the identified mode, allowing more efficient fast adaptation and demonstrating the effectiveness of the model in modulating the meta-learned prior in response to the characteristics of tasks.
Decoder Choice Network for Meta-Learning
TLDR
This work proposes a method that controls the gradient descent process over the model parameters of a neural network by limiting the model's parameters to a low-dimensional latent space, and introduces ensemble learning on top of the proposed approach to improve performance.

References

Showing 1-10 of 59 references
Optimization as a Model for Few-Shot Learning
Meta-SGD: Learning to Learn Quickly for Few Shot Learning
TLDR
Meta-SGD, an SGD-like, easily trainable meta-learner that can initialize and adapt any differentiable learner in just one step, shows highly competitive performance for few-shot learning on regression, classification, and reinforcement learning.
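The Meta-SGD summary above can be made concrete with a small sketch that reuses the loss function and parameter pytree from the MAML example earlier: in addition to the initialization, Meta-SGD meta-learns a per-parameter inner step size alpha, so adaptation is the single step theta' = theta - alpha * grad. Names, shapes, and default values here are assumptions.

def init_alphas(params, init_value=0.01):
    # One learned step size per parameter, initialized to a small constant.
    return jax.tree_util.tree_map(lambda p: jnp.full_like(p, init_value), params)

def meta_sgd_adapt(params, alphas, x_s, y_s):
    grads = jax.grad(loss)(params, x_s, y_s)
    # Elementwise learned step sizes replace the fixed scalar inner learning rate.
    return jax.tree_util.tree_map(lambda p, a, g: p - a * g, params, alphas, grads)

def meta_sgd_loss(params, alphas, x_s, y_s, x_q, y_q):
    # Both params and alphas receive meta-gradients from the query-set loss.
    return loss(meta_sgd_adapt(params, alphas, x_s, y_s), x_q, y_q)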
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
TLDR
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.
Learning to reinforcement learn
TLDR
This work introduces a novel approach to deep meta-reinforcement learning, which is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure.
On First-Order Meta-Learning Algorithms
TLDR
A family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates, including Reptile, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task.
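The Reptile description above translates to a particularly simple update; the sketch below again reuses the loss function and parameter pytree from the MAML example, with placeholder step counts and learning rates.

def reptile_step(params, x_s, y_s, inner_lr=0.01, inner_steps=5, outer_lr=0.1):
    adapted = params
    # Train normally on the sampled task for a few SGD steps...
    for _ in range(inner_steps):
        grads = jax.grad(loss)(adapted, x_s, y_s)
        adapted = jax.tree_util.tree_map(lambda p, g: p - inner_lr * g, adapted, grads)
    # ...then move the initialization towards the task-trained weights.
    return jax.tree_util.tree_map(lambda p, a: p + outer_lr * (a - p), params, adapted)

Because only first-order gradients from ordinary task training are used, the meta-update is just an interpolation of the initialization towards the task-adapted weights, avoiding the second-order derivatives of full MAML.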
Overcoming catastrophic forgetting in neural networks
TLDR
It is shown that it is possible to overcome the catastrophic forgetting limitation of connectionist models and train networks that maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
How to train your MAML
TLDR
This paper proposes various modifications to MAML that not only stabilize the system, but also substantially improve generalization performance and convergence speed while reducing computational overhead; the resulting method is called MAML++.
Meta-Learning with Memory-Augmented Neural Networks
TLDR
The ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples, is demonstrated.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
TLDR
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Matching Networks for One Shot Learning
TLDR
This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.