Corpus ID: 220363837

In the Wild: From ML Models to Pragmatic ML Systems

Matthew Wallingford, Aditya Kusupati, Keivan Alizadeh-Vahid, Aaron Walsman, Aniruddha Kembhavi, Ali Farhadi
Enabling robust intelligence in the wild entails learning systems that offer uninterrupted inference while affording sustained training, with varying amounts of data and supervision. Such a pragmatic ML system should be able to cope with the openness and flexibility inherent in the real world. The machine learning community has organically broken down this challenging task into manageable sub-tasks such as supervised, few-shot, continual, and self-supervised learning, each affording distinctive…
Bayesian Embeddings for Few-Shot Open World Recognition
This work combines Bayesian non-parametric class priors with an embedding-based pre-training scheme to yield a highly flexible framework, referred to as few-shot learning for open world recognition (FLOWR), and benchmarks it on open-world extensions of the common MiniImageNet and TieredImageNet few-shot learning datasets.
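The core open-world requirement, deciding whether a query belongs to a known class or to something novel, can be sketched without the paper's Bayesian machinery. The version below is my own simplification: class prototypes in an embedding space plus a hand-set distance threshold standing in for FLOWR's non-parametric class priors.

```python
import numpy as np

def open_world_classify(query, prototypes, threshold=1.0):
    """Assign a query embedding to the nearest class prototype,
    or flag it as a novel class if every prototype is too far away."""
    names = list(prototypes)
    dists = np.array([np.linalg.norm(query - prototypes[n]) for n in names])
    i = int(dists.argmin())
    return names[i] if dists[i] <= threshold else "novel"

# Toy 2-D embeddings: one prototype per known class.
protos = {"cat": np.array([0.0, 0.0]), "dog": np.array([3.0, 0.0])}
print(open_world_classify(np.array([0.2, 0.1]), protos))    # cat
print(open_world_classify(np.array([10.0, 10.0]), protos))  # novel
```

A learned threshold or a prior over unseen classes, as in the paper, replaces the fixed cutoff in practice.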
Are We Learning Yet? A Meta-Review of Evaluation Failures Across Machine Learning
Many subfields of machine learning share a common stumbling block: evaluation. Advances in machine learning often evaporate under closer scrutiny or turn out to be less widely applicable than
Understanding the Role of Training Regimes in Continual Learning
This work hypothesizes that the geometrical properties of the local minima found for each task play an important role in the overall degree of forgetting, and studies the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima and consequently help the model avoid catastrophic forgetting.
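The "wide minima" hypothesis can be probed with a simple proxy: how much the loss rises under small random weight perturbations. The sketch below is illustrative only, using two hand-picked quadratic bowls rather than trained networks.

```python
import numpy as np

def sharpness(loss_fn, w, radius=0.1, trials=100, seed=0):
    """Average loss increase under random perturbations of the weights:
    a crude proxy for how narrow the surrounding minimum is."""
    rng = np.random.default_rng(seed)
    base = loss_fn(w)
    bumps = [loss_fn(w + radius * rng.standard_normal(w.shape)) - base
             for _ in range(trials)]
    return float(np.mean(bumps))

wide = lambda w: float(0.5 * w @ w)     # shallow bowl: flat minimum
narrow = lambda w: float(50.0 * w @ w)  # steep bowl: sharp minimum

w0 = np.zeros(2)
print(sharpness(wide, w0) < sharpness(narrow, w0))  # True
```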
Streaming Self-Training via Domain-Agnostic Unlabeled Images
We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models such that a non-expert user can define a new task depending on their needs via a
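The self-training loop underlying SST-style methods is simple to sketch: pseudo-label unlabeled data with the current model, keep only confident predictions, and refit. The version below is a toy, using a nearest-centroid classifier and a distance-margin confidence score of my own choosing, not the paper's deep models.

```python
import numpy as np

def fit_centroids(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_with_margin(x, centroids):
    """Return (label, margin): runner-up distance minus best distance."""
    labels = list(centroids)
    d = np.array([np.linalg.norm(x - centroids[c]) for c in labels])
    order = d.argsort()
    return labels[order[0]], d[order[1]] - d[order[0]]

def self_train(X_lab, y_lab, X_unlab, margin=1.0, rounds=3):
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        cents = fit_centroids(X, y)
        keep, labels = [], []
        for x in X_unlab:
            lab, m = predict_with_margin(x, cents)
            if m >= margin:              # only trust confident pseudo-labels
                keep.append(x); labels.append(lab)
        if not keep:
            break
        X = np.vstack([X_lab, keep]); y = np.concatenate([y_lab, labels])
    return fit_centroids(X, y)

X_lab = np.array([[0.0, 0.0], [4.0, 0.0]])
y_lab = np.array([0, 1])
X_unlab = np.array([[0.5, 0.0], [3.5, 0.0], [2.0, 0.0]])
cents = self_train(X_lab, y_lab, X_unlab)
print(cents[0], cents[1])  # centroids pulled toward confident unlabeled points
```

The ambiguous midpoint sample is never pseudo-labeled, which is the mechanism that keeps self-training from reinforcing its own mistakes.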
Continual (sequential) training and multitask (simultaneous) training are often attempting to solve the same overall objective: to find a solution that performs well on all considered tasks. The main
Wide Neural Networks Forget Less Catastrophically
This work focuses on the model itself and studies the impact of the “width” of the neural network architecture on catastrophic forgetting, showing that width has a surprisingly significant effect on forgetting.
Linear Mode Connectivity in Multitask and Continual Learning
It is empirically found that different minima of the same task are typically connected by very simple curves of low error, and this finding is exploited to propose an effective algorithm that constrains the sequentially learned minima to behave as the multitask solution.
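Checking linear mode connectivity amounts to evaluating the loss along the straight line between two weight vectors. A minimal sketch, using a deliberately under-determined linear regression so that two distinct minima are connected at zero loss:

```python
import numpy as np

def loss(w, X, y):
    """Mean squared error of a linear model y ≈ X @ w."""
    return float(np.mean((X @ w - y) ** 2))

def path_losses(w_a, w_b, X, y, steps=5):
    """Evaluate the loss along the straight line from w_a to w_b."""
    ts = np.linspace(0.0, 1.0, steps)
    return [loss((1 - t) * w_a + t * w_b, X, y) for t in ts]

# One equation, two unknowns: infinitely many exact minima.
X = np.array([[1.0, 1.0]])
y = np.array([2.0])
w_a = np.array([2.0, 0.0])   # both satisfy X @ w = y
w_b = np.array([0.0, 2.0])
print(path_losses(w_a, w_b, X, y))  # zero loss everywhere on the path
```

For deep networks the same probe is run on checkpoints from different tasks; a low-error path is what the paper's constraint tries to enforce.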
One Size Does Not Fit All: Multi-Scale, Cascaded RNNs for Radar Classification
This work proposes MSC-RNN, a multi-scale, cascaded recurrent neural network architecture composed of an efficient multi-instance learning (MIL) RNN for clutter discrimination at the lower tier and a more complex RNN classifier for source classification at the upper tier.
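The cascade pattern itself is architecture-agnostic: a cheap first tier rejects easy negatives so the expensive second tier runs rarely. The sketch below replaces both RNN tiers with trivial placeholder rules (an energy threshold and a peak-amplitude test), which are stand-ins of my own, not the paper's models.

```python
def tier1_is_source(window):
    """Cheap discriminator: stand-in for the low-tier MIL RNN.
    Here: a simple energy threshold (placeholder, not the paper's model)."""
    return sum(v * v for v in window) > 1.0

def tier2_classify(window):
    """Expensive classifier: stand-in for the upper-tier RNN."""
    return "human" if max(window) > 0.8 else "vehicle"

def cascade(window):
    """Run the costly tier-2 model only when tier-1 rejects clutter."""
    if not tier1_is_source(window):
        return "clutter"
    return tier2_classify(window)

print(cascade([0.1, 0.1]))        # clutter: tier-2 never runs
print(cascade([1.0, 0.9]))        # human
print(cascade([0.7, 0.7, 0.7]))   # vehicle
```

The efficiency win comes from how often the first branch short-circuits on clutter-dominated streams.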


GDumb: A Simple Approach that Questions Our Progress in Continual Learning
We discuss a general formulation for the Continual Learning (CL) problem for classification—a learning task where a stream provides samples to a learner and the goal of the learner, depending on the
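GDumb's mechanism is deliberately simple: greedily maintain a class-balanced memory of the stream, then train a model from scratch on that memory at evaluation time. A sketch of the buffer half, with the eviction rule paraphrased from the greedy-balancing idea (class names and details are my own):

```python
from collections import defaultdict
import random

class GreedyBalancedBuffer:
    """GDumb-style memory: greedily keep a class-balanced sample of the
    stream. At eval time, a model is trained from scratch on its contents."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.store = defaultdict(list)   # class -> samples
        self.rng = random.Random(seed)

    def __len__(self):
        return sum(len(v) for v in self.store.values())

    def add(self, x, y):
        if len(self) < self.capacity:
            self.store[y].append(x)
            return
        biggest = max(self.store, key=lambda c: len(self.store[c]))
        if len(self.store[y]) + 1 > len(self.store[biggest]):
            return                        # adding would unbalance the buffer
        victim = self.rng.randrange(len(self.store[biggest]))
        self.store[biggest].pop(victim)
        self.store[y].append(x)

buf = GreedyBalancedBuffer(capacity=4)
for i in range(4):
    buf.add(i, 0)          # stream is dominated by class 0 at first
for i in range(3):
    buf.add(10 + i, 1)     # class 1 arrives later; buffer rebalances
# buffer now holds two samples of each class
```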
Continual Unsupervised Representation Learning
The proposed approach (CURL) performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting.
Recent Advances in Autoencoder-Based Representation Learning
This work gives an in-depth review of recent advances in representation learning, focusing on autoencoder-based models that make use of meta-priors believed useful for downstream tasks, such as disentanglement and hierarchical organization of features.
Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference
This work proposes a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples, and introduces a new algorithm, Meta-Experience Replay, that directly exploits this view by combining experience replay with optimization based meta-learning.
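The transfer/interference trade-off has a concrete signature: the dot product between per-example gradients. Positive alignment means one update helps the other (transfer); negative alignment means they fight (interference). A scalar-model sketch, with the examples chosen by hand for illustration:

```python
import numpy as np

def example_grad(w, x, y):
    """Gradient of the squared error 0.5*(w·x - y)^2 for one example."""
    return (w @ x - y) * x

w = np.array([1.0, 0.0])
g1 = example_grad(w, np.array([1.0, 0.0]), 0.0)   # pushes w[0] down
g2 = example_grad(w, np.array([1.0, 0.0]), 2.0)   # pushes w[0] up
g3 = example_grad(w, np.array([2.0, 0.0]), 0.0)   # agrees with g1

print(float(g1 @ g2))  # negative: interference
print(float(g1 @ g3))  # positive: transfer
```

Meta-Experience Replay optimizes for the positive case across replayed and current examples.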
Optimization as a Model for Few-Shot Learning
Prototypical Networks for Few-shot Learning
This work proposes Prototypical Networks for few-shot classification, and provides an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning.
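The prototype idea reduces to a few lines: average the support embeddings per class, then classify queries by nearest prototype. The sketch below uses raw 2-D vectors in place of a learned embedding network, which is an illustrative simplification.

```python
import numpy as np

def prototypes(support, labels):
    """Mean embedding per class over the support set."""
    return {c: support[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(query, protos):
    """Nearest prototype in squared Euclidean distance."""
    return min(protos, key=lambda c: np.sum((query - protos[c]) ** 2))

support = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
labels = np.array([0, 0, 1, 1])
protos = prototypes(support, labels)           # {0: [0,1], 1: [4,1]}
print(classify(np.array([1.0, 1.0]), protos))  # 0
```

In the full method the distances are computed on encoder outputs and turned into a softmax over classes for training.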
Meta-Transfer Learning for Few-Shot Learning
A novel few-shot learning method called meta-transfer learning (MTL) which learns to adapt a deep NN for few shot learning tasks and introduces the hard task (HT) meta-batch scheme as an effective learning curriculum for MTL.
BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning
BatchEnsemble is proposed, an ensemble method whose computational and memory costs are significantly lower than typical ensembles and can easily scale up to lifelong learning on Split-ImageNet which involves 100 sequential learning tasks.
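BatchEnsemble's saving comes from an algebraic identity: each member's weight matrix is the shared matrix modulated elementwise by a rank-1 factor, W_i = W ∘ (r_i s_iᵀ), so all members can share one matmul. A minimal sketch verifying the identity on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
n_out, n_in, members = 3, 4, 2

W = rng.normal(size=(n_out, n_in))          # shared "slow" weights
R = rng.normal(size=(members, n_out))       # per-member rank-1 factors
S = rng.normal(size=(members, n_in))
x = rng.normal(size=n_in)

# Explicit per-member weights: W_i = W ∘ (r_i s_iᵀ)
explicit = np.stack([(W * np.outer(R[i], S[i])) @ x for i in range(members)])

# BatchEnsemble trick: never materialise W_i, reuse the shared matmul.
fast = R * (W @ (S * x).T).T

assert np.allclose(explicit, fast)
```

Memory grows by two vectors per member instead of a full weight matrix, which is what makes 100-task lifelong learning feasible.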
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning
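The snippet above describes MAML's inner/outer loop. A minimal sketch on scalar quadratic tasks, with the meta-gradient through the inner update worked out by hand instead of by autodiff (the toy losses are my own choice):

```python
def task_loss(w, c):
    return (w - c) ** 2

def task_grad(w, c):
    return 2.0 * (w - c)

def maml_step(w, tasks, inner_lr=0.1, outer_lr=0.05):
    """One MAML meta-update on quadratic tasks loss_c(w) = (w - c)^2.
    The meta-gradient differentiates through the inner adaptation step."""
    meta_grad = 0.0
    for c in tasks:
        w_adapted = w - inner_lr * task_grad(w, c)        # inner step
        # chain rule: d w_adapted / d w = 1 - 2 * inner_lr
        meta_grad += task_grad(w_adapted, c) * (1.0 - 2.0 * inner_lr)
    return w - outer_lr * meta_grad

w = 5.0
for _ in range(200):
    w = maml_step(w, tasks=[-1.0, 1.0])
print(round(w, 3))  # 0.0: the initialization adapting fastest to both tasks
```

The meta-solution sits between the task optima, exactly the "good initialization for fast adaptation" the method seeks.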
Task-Free Continual Learning
This work investigates how to transform continual learning to an online setup, and develops a system that keeps on learning over time in a streaming fashion, with data distributions gradually changing and without the notion of separate tasks.
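The streaming setting without task boundaries can be sketched as plain per-sample SGD on a drifting data stream; the drifting-slope regression below is my own toy, not the paper's protocol, but it shows an online learner tracking a distribution that changes with no task switch to signal it.

```python
import numpy as np

def stream(n, seed=0):
    """Regression stream whose true slope drifts gradually:
    no task boundaries, just a smoothly changing distribution."""
    rng = np.random.default_rng(seed)
    for t in range(n):
        slope = 1.0 + t / n           # drifts from 1.0 toward 2.0
        x = rng.normal()
        yield x, slope * x

w = 0.0
lr = 0.05
for x, y in stream(2000):
    w -= lr * (w * x - y) * x         # one SGD step per incoming sample
print(round(w, 1))  # tracks the recent slope, ≈ 2.0 by the end
```

The task-free challenge is doing this while also retaining competence on the early distribution, which plain SGD here does not.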