• Corpus ID: 219708686

When MAML Can Adapt Fast and How to Assist When It Cannot

@inproceedings{
  title={When MAML Can Adapt Fast and How to Assist When It Cannot},
  author={S{\'e}bastien M. R. Arnold and Shariq Iqbal and Fei Sha},
  booktitle={International Conference on Artificial Intelligence and Statistics},
}
Model-Agnostic Meta-Learning (MAML) and its variants have achieved success in meta-learning tasks on many datasets and settings. On the other hand, we have just started to understand and analyze how they are able to adapt fast to new tasks. For example, one popular hypothesis is that the algorithms learn good representations for transfer, as in multi-task learning. In this work, we contribute by providing a series of empirical and theoretical studies, and discover several interesting yet… 

Modular Meta-Learning with Shrinkage

This work develops general techniques based on Bayesian shrinkage to automatically discover and learn both task-specific and general reusable modules and demonstrates that this method outperforms existing meta-learning approaches in domains like few-shot text-to-speech that have little task data and long adaptation horizons.

learn2learn: A Library for Meta-Learning Research

Meta-learning researchers face two fundamental issues in their empirical work: prototyping and reproducibility. Researchers are prone to make mistakes when prototyping new algorithms and tasks

A Channel Coding Benchmark for Meta-Learning

This work proposes the channel coding problem as a benchmark for meta-learning and uses this benchmark to study several aspects of meta-learning, including the impact of task distribution breadth and shift, which can be controlled in the coding problem.

MAC: A Meta-Learning Approach for Feature Learning and Recombination

The width-depth duality of neural networks is invoked: the width of the network is increased by adding extra computational units (ACUs) to enable the learning of new atomic features on the meta-testing task, and the increased width facilitates information propagation in the forward pass.

Improving Offline Handwritten Text Recognition Using Conditional Writer-Specific Knowledge

  • Computer Science
  • 2022
Results show that an HTR-specific version of MAML known as MetaHTR improves performance over the baseline, with a 1.4 to 2.0 improvement in word error rate, and that a deeper model lends itself better to adaptation with MetaHTR than a shallower one.

Improving GNN-based accelerator design automation with meta learning

Experiments show the MAML-enhanced model outperforms a simple fine-tuning baseline in terms of both offline evaluation on held-out test sets and online evaluation of design-space exploration (DSE) speedup.

Few-Shot Class-Incremental Learning by Sampling Multi-Phase Tasks

This work proposes a new meta-learning paradigm for FSCIL, LearnIng Multi-phase Incremental Tasks (LIMIT), which synthesizes fake FSCIL tasks from the base dataset and can build a feature space that generalizes to unseen tasks through meta-learning.

A Channel Coding Benchmark for Meta-Learning

This work proposes the channel coding problem as a benchmark for meta-learning, and uses the MetaCC benchmark to study several aspects of meta-learning, including the impact of task distribution breadth and shift, which can be controlled in the coding problem.

Forward Compatible Few-Shot Class-Incremental Learning

Virtual prototypes are assigned to squeeze the embeddings of known classes and reserve space for new classes, allowing the model to accept future updates; the prototypes act as proxies scattered across the embedding space to build a stronger classifier during inference.



Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

The ANIL (Almost No Inner Loop) algorithm is proposed: a simplification of MAML in which the inner loop is removed for all but the task-specific head of a MAML-trained network. Test-task performance is shown to be entirely determined by the quality of the learned features, so that even the head can be removed (the NIL algorithm).
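The ANIL idea — freeze the feature-extracting body and adapt only the head in the inner loop — can be sketched in a few lines. A minimal illustration; the random ReLU body, toy regression task, and hyperparameters are assumptions for illustration, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "body": a fixed random ReLU feature extractor (an assumption
# standing in for a meta-trained backbone).
W_body = rng.normal(size=(2, 32)) / np.sqrt(2)

def body(x):
    """Fixed feature extractor; x has shape (n, 2)."""
    return np.maximum(x @ W_body, 0.0)

def adapt_head(head, x_support, y_support, steps=1000, lr=0.02):
    """Inner loop touches only the linear head, as in ANIL."""
    F = body(x_support)                  # body is never updated
    for _ in range(steps):
        grad = 2 * F.T @ (F @ head - y_support) / len(y_support)
        head = head - lr * grad
    return head

# Usage: adapt the head to a few support points of a new task.
x_s = rng.uniform(-1.0, 1.0, size=(20, 2))
y_s = x_s[:, 0] + x_s[:, 1]              # toy task: y = x0 + x1
new_head = adapt_head(np.zeros(32), x_s, y_s)
```

Only `new_head` changes per task; with a good body, this head-only adaptation is what ANIL argues suffices.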

On First-Order Meta-Learning Algorithms

A family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates, including Reptile, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task.
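The Reptile loop described above — sample a task, train on it, move the initialization toward the adapted weights — can be sketched directly. The toy task family (fit y = a*x for a random slope a) and all hyperparameters are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    return rng.uniform(-2.0, 2.0)          # a task is a random slope a

def inner_train(w, a, steps=5, lr=0.1):
    """A few SGD steps on this task's squared loss (w*x - a*x)^2."""
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0, size=8)
        grad = np.mean(2 * (w * x - a * x) * x)
        w = w - lr * grad
    return w

w, meta_lr = 0.0, 0.5                      # meta-initialization, outer step
for _ in range(200):
    a = sample_task()
    w_task = inner_train(w, a)
    w = w + meta_lr * (w_task - w)         # Reptile: step toward adapted weights
```

Note the outer update uses only the difference `w_task - w`, so no second-order derivatives are ever computed — the first-order property the paper emphasizes.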

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems.
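The generic MAML update — adapt with an inner gradient step, then differentiate the post-adaptation loss with respect to the initialization — can be illustrated on scalar tasks where the meta-gradient, including its second-order term, can be written by hand. The quadratic per-task loss and step sizes below are assumptions chosen for transparency, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

alpha, beta = 0.1, 0.05     # inner (adaptation) and outer (meta) step sizes
w = 0.0                     # meta-initialization

for _ in range(500):
    a = rng.uniform(-1.0, 1.0)             # sample a task with loss (w - a)^2
    w_adapted = w - alpha * 2 * (w - a)    # one inner gradient step
    # Meta-gradient d L_a(w_adapted) / d w; the (1 - 2*alpha) factor is
    # d w_adapted / d w, the second-order term MAML backpropagates through.
    meta_grad = 2 * (w_adapted - a) * (1 - 2 * alpha)
    w = w - beta * meta_grad
```

In deep-learning frameworks the `(1 - 2*alpha)` factor is produced automatically by differentiating through the inner step; writing it out here makes the second-order nature of MAML explicit.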

Learning to Learn with Gradients

This thesis discusses gradient-based algorithms for learning to learn, or meta-learning, which aim to endow machines with flexibility akin to that of humans, and shows how these methods can be extended for applications in motor control by combining elements of meta-learning with techniques for deep model-based reinforcement learning, imitation learning, and inverse reinforcement learning.

Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm

This paper finds that deep representations combined with standard gradient descent have sufficient capacity to approximate any learning algorithm, and that gradient-based meta-learning consistently leads to learning strategies that generalize more widely than those represented by recurrent models.

Meta-learning with differentiable closed-form solvers

The main idea is to teach a deep network to use standard machine learning tools, such as ridge regression, as part of its own internal model, enabling it to quickly adapt to novel data.
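A minimal sketch of the closed-form inner-solver idea: adapt to a task by solving ridge regression exactly on top of fixed features, instead of running inner-loop gradient steps. The random tanh "backbone" and the toy task are assumptions for illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

W_feat = rng.normal(size=(1, 16))          # stands in for a learned backbone

def features(x):
    return np.tanh(x[:, None] @ W_feat)    # (n,) -> (n, 16)

def adapt(x_support, y_support, lam=1e-2):
    """Closed-form ridge solution w = (F^T F + lam*I)^{-1} F^T y."""
    F = features(x_support)
    return np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]),
                           F.T @ y_support)

# Usage: fit a task from ten support points, predict at a query point.
x_s = np.linspace(-1.0, 1.0, 10)
w = adapt(x_s, 2.0 * x_s)                  # toy task: y = 2x
y_pred = features(np.array([0.5])) @ w
```

Because the ridge solve is differentiable, the backbone (here frozen for simplicity) can be meta-trained end-to-end through the adaptation step.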

Alpha MAML: Adaptive Model-Agnostic Meta-Learning

An extension to MAML is introduced that incorporates an online hyperparameter adaptation scheme, eliminating the need to tune the meta-learning and learning rates; results on the Omniglot database demonstrate a substantial reduction in the need to tune MAML training hyperparameters and improved training stability, with less sensitivity to hyperparameter choice.

Learned Optimizers that Scale and Generalize

This work introduces a learned gradient descent optimizer that generalizes well to new tasks, and which has significantly reduced memory and computation overhead, by introducing a novel hierarchical RNN architecture with minimal per-parameter overhead.

Meta-SGD: Learning to Learn Quickly for Few Shot Learning

Meta-SGD, an SGD-like, easily trainable meta-learner that can initialize and adapt any differentiable learner in just one step, shows highly competitive performance for few-shot learning on regression, classification, and reinforcement learning.
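Meta-SGD's one-step adaptation — learn the initialization w and a learning rate alpha jointly — can be sketched on scalar tasks where both meta-gradients can be written by hand. The quadratic per-task loss and step size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

w, alpha = 0.0, 0.1        # both are meta-learned parameters
beta = 0.02                # meta step size

for _ in range(2000):
    a = rng.uniform(-1.0, 1.0)            # task with loss L_a(w) = (w - a)^2
    g = 2 * (w - a)                       # task gradient at w
    w_adapted = w - alpha * g             # one-step adaptation
    d = 2 * (w_adapted - a)               # gradient of post-adaptation loss
    w = w - beta * d * (1 - 2 * alpha)    # dL/dw (chain rule through w_adapted)
    alpha = alpha + beta * d * g          # dL/dalpha = -d*g; descending adds d*g
```

On this family the optimal rate is alpha = 0.5 (one step lands exactly on the task optimum), and the meta-updates drive alpha there; in Meta-SGD proper, alpha is a full per-parameter vector learned the same way.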

Meta Learning via Learned Loss

This paper presents a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures, and develops a pipeline for “meta-training” such loss functions, targeted at maximizing the performance of the model trained under them.