Corpus ID: 384205

Learning to Compose Domain-Specific Transformations for Data Augmentation

@article{Ratner2017LearningTC,
  title={Learning to Compose Domain-Specific Transformations for Data Augmentation},
  author={Alexander J. Ratner and Henry R. Ehrenberg and Zeshan Hussain and Jared A. Dunnmon and Christopher R{\'e}},
  journal={Advances in neural information processing systems},
  year={2017},
  volume={30},
  pages={3239--3249}
}
Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual transformations, constructing and tuning the more sophisticated compositions typically needed to achieve state-of-the-art results is a time-consuming manual task in practice. We propose a method for automating this process by learning a generative sequence model over user-specified transformation functions using a generative adversarial approach. Our method can make use of arbitrary, non-deterministic transformation functions, is robust to misspecified user input, and is trained on unlabeled data. The learned transformation model can then be used to perform data augmentation for any end discriminative model. In our experiments, we show the efficacy of our approach on…
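To make the composition pipeline concrete, here is a minimal sketch of the interface the abstract describes. The transformation functions, probabilities, and sequence length below are illustrative assumptions, and the fixed categorical policy is only a stand-in for the paper's learned generative sequence model, which is trained adversarially on unlabeled data.

```python
import numpy as np

# Hypothetical user-specified transformation functions (TFs): each takes a
# data point and an RNG and returns a transformed copy. TFs may be
# non-deterministic and need only approximately preserve class labels.
def rotate90(x, rng):
    return np.rot90(x)

def flip_lr(x, rng):
    return np.fliplr(x)

def add_noise(x, rng):
    return x + rng.normal(scale=0.05, size=x.shape)

TFS = [rotate90, flip_lr, add_noise]

def augment(x, tf_probs, seq_len, rng):
    """Apply a sampled sequence of TFs to x.

    The fixed categorical distribution `tf_probs` stands in for the
    learned sequence model that keeps composed points in-distribution.
    """
    for _ in range(seq_len):
        tf = TFS[rng.choice(len(TFS), p=tf_probs)]
        x = tf(x, rng)
    return x

rng = np.random.default_rng(0)
image = rng.random((32, 32))
augmented = augment(image, tf_probs=[0.25, 0.25, 0.5], seq_len=4, rng=rng)
```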

Citations

PAGANDA: An Adaptive Task-Independent Automatic Data Augmentation
TLDR
This paper demonstrates experimentally that the proposed Parallel Adaptive GAN Data Augmentation (PAGANDA) strategy can be easily adapted to cross-domain deep learning/machine learning tasks such as image classification and image inpainting, while significantly improving model performance in both tasks.
Semantic Perturbations with Normalizing Flows for Improved Generalization
TLDR
It is found that the latent adversarial perturbations adaptive to the classifier throughout its training are most effective, yielding the first test accuracy improvement results on real-world datasets—CIFAR-10/100—via latent-space perturbation.
Safe Augmentation: Learning Task-Specific Transformations from Data
TLDR
This work proposes a simple novel method that automatically learns task-specific data augmentations, called safe augmentations, which do not break the data distribution and can be used to improve model performance.
Adversarial Learning of General Transformations for Data Augmentation
TLDR
This work learns data augmentation directly from the training data by learning to transform images with an encoder-decoder architecture combined with a spatial transformer network.
Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data
TLDR
Adversarial Data Programming is presented: an adversarial methodology to generate data as well as a curated aggregated label, given a set of weak labeling functions; it outperforms many state-of-the-art models.
Learning to Generate Synthetic Data via Compositing
TLDR
A task-specific approach to synthetic data generation that employs a trainable synthesizer network, trained in an adversarial manner and optimized to produce meaningful training samples by assessing the strengths and weaknesses of a ‘target’ classifier.
Data Augmentation via Structured Adversarial Perturbations
TLDR
This work proposes a method to generate adversarial examples that maintain some desired natural structure and demonstrates this approach through two types of image transformations: photometric and geometric.
Generative Adversarial Data Programming
TLDR
This work presents Adversarial Data Programming (ADP), an adversarial methodology to generate data as well as a curated aggregated label, given a set of weak labeling functions.
Deep Adversarial Data Augmentation for Extremely Low Data Regimes
TLDR
This work formulates data augmentation as the problem of training a class-conditional, supervised generative adversarial network (GAN) and proposes a new discriminator loss to fit the goal of data augmentation, through which both real and augmented samples are enforced to contribute to, and be consistent in, finding the decision boundaries.
On the Generalization Effects of Linear Transformations in Data Augmentation
TLDR
This work considers a family of linear transformations, studies their effects on the ridge estimator in an over-parametrized linear regression setting, and proposes an augmentation scheme that searches over the space of transformations by how uncertain the model is about the transformed data.
...

References

SHOWING 1-10 OF 37 REFERENCES
Dreaming More Data: Class-dependent Distributions over Diffeomorphisms for Learned Data Augmentation
TLDR
This work aligns image pairs within each class under the assumption that the spatial transformation between images belongs to a large class of diffeomorphisms, and learns class-specific probabilistic generative models of the transformations in a Riemannian submanifold of the Lie group of diffeomorphisms.
Dataset Augmentation in Feature Space
TLDR
This paper adopts a simpler, domain-agnostic approach to dataset augmentation, and works in the space of context vectors generated by sequence-to-sequence models, demonstrating a technique that is effective for both static and sequential data.
Adaptive data augmentation for image classification
TLDR
A new automatic and adaptive algorithm for choosing the transformations applied to samples in data augmentation; for each sample, the main idea is to seek a small transformation that yields maximal classification loss on the transformed sample.
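As an illustration of that idea (a sketch, not the authors' exact optimization procedure), one can greedily pick, from a handful of small candidate transformations, the one the current classifier finds hardest; `loss_fn` and `candidate_tfs` are assumed to be supplied by the caller.

```python
import numpy as np

def hardest_augmentation(x, y, loss_fn, candidate_tfs, rng):
    """Return the transformed copy of (x, y) with maximal classifier loss.

    `loss_fn(x, y)` is assumed to evaluate the current model's loss on a
    single example; `candidate_tfs` are small, label-preserving transforms.
    """
    candidates = [tf(x, rng) for tf in candidate_tfs]
    losses = [loss_fn(c, y) for c in candidates]
    return candidates[int(np.argmax(losses))]
```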
Improved Techniques for Training GANs
TLDR
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic, and presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
RenderGAN: Generating Realistic Labeled Data
TLDR
This work presents a novel framework called RenderGAN that can generate large amounts of realistic, labeled images by combining a 3D model and the Generative Adversarial Network framework, and applies it to generate images of barcode-like markers that are attached to honeybees.
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
TLDR
An unsupervised loss function is proposed that takes advantage of the stochastic nature of randomization techniques such as dropout and random data augmentation, and minimizes the difference between the predictions of multiple passes of a training sample through the network.
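A minimal sketch of such a consistency loss, assuming `predict` performs one stochastic forward pass (dropout, random augmentation, etc.) and returns a probability vector; this pairwise squared-difference form follows the summary above and is not necessarily the paper's exact formulation.

```python
import numpy as np

def stability_loss(predict, x, n_passes=4):
    """Penalize disagreement between stochastic passes on the same input."""
    preds = [predict(x) for _ in range(n_passes)]
    loss = 0.0
    for i in range(n_passes):
        for j in range(i + 1, n_passes):
            # Squared difference between every pair of predictions.
            loss += float(np.sum((preds[i] - preds[j]) ** 2))
    return loss
```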
Distributional Smoothing with Virtual Adversarial Training
TLDR
When LDS-based regularization was applied to supervised and semi-supervised learning on the MNIST dataset, it outperformed all training methods other than the current state-of-the-art method, which is based on a highly advanced generative model.
Generative Adversarial Nets
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Conditional Generative Adversarial Nets
TLDR
The conditional version of generative adversarial nets is introduced, which can be constructed by simply feeding the data, y, to the generator and discriminator, and it is shown that this model can generate MNIST digits conditioned on class labels.
Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
In this paper we present a method for learning a discriminative classifier from unlabeled or partially labeled data. Our approach is based on an objective function that trades off mutual information between observed examples and their predicted categorical class distribution, against robustness of the classifier to an adversarial generative model.
...