Corpus ID: 219708686

When MAML Can Adapt Fast and How to Assist When It Cannot

Sébastien M. R. Arnold, Shariq Iqbal, Fei Sha
Model-Agnostic Meta-Learning (MAML) and its variants have achieved success in meta-learning tasks on many datasets and settings. On the other hand, we have just started to understand and analyze how they are able to adapt fast to new tasks. For example, one popular hypothesis is that the algorithms learn good representations for transfer, as in multi-task learning. In this work, we contribute by providing a series of empirical and theoretical studies, and discover several interesting yet…
learn2learn: A Library for Meta-Learning Research
Meta-learning researchers face two fundamental issues in their empirical work: prototyping and reproducibility. Researchers are prone to make mistakes when prototyping new algorithms and tasks…
A Channel Coding Benchmark for Meta-Learning
  • Rui Li
  • 2021
Meta-learning provides a popular and effective family of methods for data-efficient learning of new tasks. However, several important issues in meta-learning have proven hard to study thus far. For…
A Channel Coding Benchmark for Meta-Learning
This work proposes the channel coding problem as a benchmark for meta-learning and uses it to study several aspects of meta-learning, including the impact of task distribution breadth and shift, both of which can be controlled in the coding problem.
A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning
This work argues that the train-validation split encourages the learned representation to be low-rank without compromising expressivity, whereas the non-splitting variant encourages high-rank representations.
Co-Transport for Class-Incremental Learning
This work proposes CO-transport for class Incremental Learning (COIL), which learns to relate across incremental tasks through class-wise semantic relationships, adapts efficiently to new tasks, and stably resists forgetting.
How Important is the Train-Validation Split in Meta-Learning?
A detailed theoretical study of whether and when the train-validation split is helpful on the linear centroid meta-learning problem, in the asymptotic setting where the number of tasks goes to infinity. The results highlight that data splitting may not always be preferable, especially when the data is realizable by the model.
Offline Meta-Reinforcement Learning with Advantage Weighting
This paper introduces the offline meta-reinforcement learning (offline meta-RL) problem setting and proposes MACAW, an optimization-based meta-learning algorithm that performs well in this setting and uses simple, supervised regression objectives for both the inner and outer loop of meta-training.
Pruning Meta-Trained Networks for On-Device Adaptation
Adaptation-aware network pruning (ANP) is proposed, a novel pruning scheme that works with existing meta-learning methods to produce a compact network capable of fast adaptation, and that uses a weight importance metric based on the sensitivity of the meta-objective rather than the conventional loss function.
Modular Meta-Learning with Shrinkage
This work develops general techniques based on Bayesian shrinkage to automatically discover and learn both task-specific and general reusable modules and demonstrates that this method outperforms existing meta-learning approaches in domains like few-shot text-to-speech that have little task data and long adaptation horizons.


Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
This work proposes ANIL (Almost No Inner Loop), a simplification of MAML in which the inner loop is removed for all but the (task-specific) head of a MAML-trained network. Performance on the test tasks is entirely determined by the quality of the learned features, and one can even remove the head of the network (the NIL algorithm).
On First-Order Meta-Learning Algorithms
A family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates, including Reptile, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task.
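The Reptile loop summarized above (sample a task, train on it, move the initialization toward the trained weights) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's implementation: the model is a single scalar weight, each task is a hypothetical 1-D regression y = slope * x, and the function names, step counts, and learning rates are chosen only for the example.

```python
import random


def inner_sgd(w, slope, rng, steps=5, lr=0.1):
    """Train a copy of the weight on one toy task (fit y = slope * x) with plain SGD."""
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        # Gradient of the squared error (w*x - slope*x)^2 with respect to w.
        grad = 2.0 * (w * x - slope * x) * x
        w -= lr * grad
    return w


def reptile(task_slopes, meta_steps=2000, meta_lr=0.1, seed=0):
    """Reptile-style meta-training of a scalar initialization (illustrative only)."""
    rng = random.Random(seed)
    w_init = 0.0
    for _ in range(meta_steps):
        slope = rng.choice(task_slopes)         # 1. sample a task
        w_task = inner_sgd(w_init, slope, rng)  # 2. train on it, starting from w_init
        w_init += meta_lr * (w_task - w_init)   # 3. move the initialization toward the trained weights
    return w_init


# Hypothetical task family: three regression tasks with slopes 1, 2 and 3.
w0 = reptile([1.0, 2.0, 3.0])
print(f"meta-learned initialization: {w0:.3f}")
```

In this toy setting the initialization settles near the middle of the task slopes, so a few SGD steps suffice to reach any one of them. Only first-order information is used: the meta-update needs the trained weights, not a gradient through the inner loop.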
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning…
Learning to Learn with Gradients
This thesis discusses gradient-based algorithms for learning to learn, or meta-learning, which aim to endow machines with flexibility akin to that of humans, and shows how these methods can be extended for applications in motor control by combining elements of meta-learning with techniques for deep model-based reinforcement learning, imitation learning, and inverse reinforcement learning.
Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm
This paper finds that deep representations combined with standard gradient descent have sufficient capacity to approximate any learning algorithm, and that gradient-based meta-learning consistently leads to learning strategies that generalize more widely than those represented by recurrent models.
Meta-learning with differentiable closed-form solvers
The main idea is to teach a deep network to use standard machine learning tools, such as ridge regression, as part of its own internal model, enabling it to quickly adapt to novel data.
Alpha MAML: Adaptive Model-Agnostic Meta-Learning
An extension to MAML is introduced that incorporates an online hyperparameter adaptation scheme, eliminating the need to tune the meta-learning and learning rates. Results on the Omniglot database demonstrate a substantial reduction in the need to tune MAML training hyperparameters and improved training stability, with less sensitivity to hyperparameter choice.
Learned Optimizers that Scale and Generalize
This work introduces a learned gradient descent optimizer that generalizes well to new tasks and has significantly reduced memory and computation overhead, achieved through a novel hierarchical RNN architecture with minimal per-parameter overhead.
Meta-SGD: Learning to Learn Quickly for Few Shot Learning
Meta-SGD, an SGD-like, easily trainable meta-learner that can initialize and adapt any differentiable learner in just one step, shows highly competitive performance for few-shot learning on regression, classification, and reinforcement learning.
Meta Learning via Learned Loss
This paper presents a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures, and develops a pipeline for “meta-training” such loss functions, targeted at maximizing the performance of the model trained under them.