Corpus ID: 235727238

Memory Efficient Meta-Learning with Large Images

@inproceedings{Bronskill2021MemoryEM,
  title={Memory Efficient Meta-Learning with Large Images},
  author={John Bronskill and Daniela Massiceti and Massimiliano Patacchiola and Katja Hofmann and Sebastian Nowozin and Richard E. Turner},
  booktitle={NeurIPS},
  year={2021}
}
Meta learning approaches to few-shot classification are computationally efficient at test time, requiring just a few optimization steps or single forward pass to learn a new task, but they remain highly memory-intensive to train. This limitation arises because a task’s entire support set, which can contain up to 1000 images, must be processed before an optimization step can be taken. Harnessing the performance gains offered by large images thus requires either parallelizing the meta-learner… 
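The bottleneck the abstract describes is easy to see in code. Below is a minimal sketch (PyTorch, prototypical-network-style episode; the encoder, embedding size, and episode shapes are illustrative assumptions, not the paper's actual model) showing that every support image's activations must stay in memory until the query loss is backpropagated, so training memory grows with support-set size and image resolution.

# Sketch only: illustrates why episodic meta-training is memory-hungry,
# not the method proposed in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy convolutional encoder producing 64-dim embeddings (illustrative).
encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())

def episode_loss(support_x, support_y, query_x, query_y, n_way):
    # Embed the ENTIRE support set; all intermediate activations are
    # retained for the backward pass, so memory scales with |support|.
    support_z = encoder(support_x)                      # [N*K, 64]
    prototypes = torch.stack([
        support_z[support_y == c].mean(0) for c in range(n_way)
    ])                                                  # [N, 64]
    query_z = encoder(query_x)                          # [Q, 64]
    logits = -torch.cdist(query_z, prototypes)          # nearest-prototype classifier
    return F.cross_entropy(logits, query_y)

# One meta-training step: the optimizer step can only be taken after the
# full support set has been processed.
# loss = episode_loss(sx, sy, qx, qy, n_way=5); loss.backward(); opt.step()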
Skill-based Meta-Reinforcement Learning
TLDR
Experimental results on continuous control tasks in navigation and manipulation demonstrate that the proposed method can efficiently solve long-horizon novel target tasks by combining the strengths of meta-learning and the use of offline datasets, while prior approaches in RL, meta-RL, and multi-task RL require substantially more environment interactions to solve the tasks.

References

SHOWING 1-10 OF 38 REFERENCES
Optimization as a Model for Few-Shot Learning
Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes
The goal of this paper is to design image classification systems that, after an initial multi-task training phase, can automatically adapt to new tasks encountered at test time. We introduce a…
Training Deep Nets with Sublinear Memory Cost
TLDR
This work designs an algorithm that costs O(√n) memory to train an n-layer network, with only the computational cost of an extra forward pass per mini-batch, and shows that it is possible to trade computation for memory, giving a more memory-efficient training algorithm at a small extra computational cost.
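A minimal sketch of this compute-for-memory trade-off using PyTorch's built-in gradient checkpointing (torch.utils.checkpoint.checkpoint_sequential) rather than the paper's own implementation; the layer sizes, depth, and segment count are illustrative assumptions.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

n_layers = 64
net = nn.Sequential(*[nn.Sequential(nn.Linear(512, 512), nn.ReLU())
                      for _ in range(n_layers)])

x = torch.randn(32, 512, requires_grad=True)

# Split the n-layer chain into roughly sqrt(n) segments: only segment
# boundaries are stored, and inner activations are recomputed during the
# backward pass (the extra forward pass traded for memory).
segments = int(n_layers ** 0.5)
y = checkpoint_sequential(net, segments, x, use_reentrant=False)
y.sum().backward()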
Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot Classification Benchmark
TLDR
A cross-family study of the best transfer and meta learners on both a large-scale meta-learning benchmark (Meta-Dataset, MD) and a transfer learning benchmark (Visual Task Adaptation Benchmark, VTAB) finds that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet.
Matching Networks for One Shot Learning
TLDR
This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.
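A rough sketch of the matching idea under simplifying assumptions (embeddings already computed, plain cosine-similarity attention; the full method also conditions the embeddings on the support set): a query is labelled by a similarity-weighted vote over the support labels, with no fine-tuning on the new task.

import torch
import torch.nn.functional as F

def matching_predict(support_z, support_y, query_z, n_classes):
    # support_z: [S, D] embeddings, support_y: [S] labels, query_z: [Q, D]
    sims = F.cosine_similarity(query_z.unsqueeze(1),
                               support_z.unsqueeze(0), dim=-1)   # [Q, S]
    attn = sims.softmax(dim=-1)                   # attention over the support set
    one_hot = F.one_hot(support_y, n_classes).float()            # [S, C]
    probs = attn @ one_hot                                       # [Q, C]
    return probs.argmax(dim=-1)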
Meta-Learning with Implicit Gradients
TLDR
Theoretically, it is proved that implicit MAML can compute accurate meta-gradients with a memory footprint that is, up to small constant factors, no more than that which is required to compute a single inner loop gradient and at no overall increase in the total computational cost.
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning…
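A hedged sketch of the inner/outer loop structure on a toy linear model (the hyperparameters, task format, and single inner step are illustrative; real implementations batch tasks and use richer models): adapt the meta-parameters on a task's support set, then backpropagate the query loss through that adaptation.

import torch
import torch.nn.functional as F

theta = torch.randn(784, 10, requires_grad=True)   # meta-parameters (toy linear model)
meta_opt = torch.optim.Adam([theta], lr=1e-3)
inner_lr = 0.01

def loss_fn(w, x, y):
    return F.cross_entropy(x @ w, y)

def maml_step(tasks):
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: one gradient step on the support set; create_graph=True
        # keeps second-order gradients for the outer update.
        g, = torch.autograd.grad(loss_fn(theta, support_x, support_y),
                                 theta, create_graph=True)
        theta_prime = theta - inner_lr * g
        # Outer loss: evaluate the adapted parameters on the query set.
        meta_loss = meta_loss + loss_fn(theta_prime, query_x, query_y)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()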
Defining Benchmarks for Continual Few-Shot Learning
TLDR
This paper defines a theoretical framework for continual few-shot learning, then proposes a range of flexible benchmarks that unify the evaluation criteria and allow the problem to be explored from multiple perspectives, and introduces a compact variant of ImageNet, called SlimageNet64, which retains all original 1000 classes but contains only 200 instances of each one.
On First-Order Meta-Learning Algorithms
TLDR
A family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates, including Reptile, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task.
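A minimal sketch of the Reptile update described above, assuming a toy PyTorch model and illustrative hyperparameters: train a copy of the model on the sampled task for a few SGD steps, then move the initialization toward the adapted weights.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(784, 10)          # toy model; its weights are the meta-parameters
meta_lr, inner_lr, inner_steps = 0.1, 0.01, 5

def reptile_step(task_x, task_y):
    fast = copy.deepcopy(model)
    opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                # train on the sampled task
        opt.zero_grad()
        F.cross_entropy(fast(task_x), task_y).backward()
        opt.step()
    with torch.no_grad():                       # move the initialization toward the trained weights
        for p, p_fast in zip(model.parameters(), fast.parameters()):
            p += meta_lr * (p_fast - p)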
How to train your MAML
TLDR
This paper proposes various modifications to MAML that not only stabilize training but also substantially improve the generalization performance and convergence speed of MAML while reducing its computational overhead; the resulting variant is called MAML++.