Generative Models for Deep Learning with Very Scarce Data

Juan Maroñas Molano, Roberto Paredes Palacios, Daniel Ramos-Castro

The goal of this paper is to deal with a data-scarcity scenario in which deep learning techniques tend to fail. We compare two well-established techniques, Restricted Boltzmann Machines and Variational Auto-encoders, as generative models for enlarging the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms to generate new samples. We show that generalization improves when comparing this methodology to other state-of…
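
The generation step this paper relies on, drawing new samples from a trained generative model via MCMC, can be sketched as block Gibbs sampling on a binary RBM. A minimal numpy sketch; the parameter names (`W`, `b_v`, `b_h`) and the random initialization are illustrative assumptions, not the paper's code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(p, rng):
    """Draw binary samples from elementwise Bernoulli probabilities."""
    return (rng.random(p.shape) < p).astype(np.float64)

def rbm_gibbs_sample(v0, W, b_v, b_h, steps, rng):
    """Block Gibbs sampling v -> h -> v on a binary RBM.

    W: (n_visible, n_hidden) weights; b_v, b_h: visible/hidden biases.
    Returns a new visible configuration after `steps` full Gibbs sweeps;
    such samples can then augment a scarce training set.
    """
    v = v0
    for _ in range(steps):
        h = sample_bernoulli(sigmoid(v @ W + b_h), rng)    # sample p(h|v)
        v = sample_bernoulli(sigmoid(h @ W.T + b_v), rng)  # sample p(v|h)
    return v
```

Starting the chain from real training points (rather than noise) is a common choice when the goal is augmentation, since the chain then stays near the data distribution.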

Active learning without unlabeled samples: generating questions and labels using Monte Carlo Tree Search

This work applies a Recurrent Neural Network and Monte Carlo Tree Search to generate unlabelled questions, uses Human-in-the-Loop feedback to decide whether the generated questions are meaningful, and shows that the generated data leads to improved classification performance compared with the vanilla dataset.

Data Augmentation Using Generative Adversarial Networks

This thesis deals with balancing image datasets by data augmentation using generative adversarial networks, a process known as class balancing, and shows how the performance of the methods deteriorates proportionately with increasing imbalance rate and dataset diversity.

Model Patching: Closing the Subgroup Performance Gap with Data Augmentation

This work instantiates model patching with CAMEL, which uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and balances subgroup performance using a theoretically-motivated subgroup consistency regularizer, accompanied by a new robust objective.

A Bayesian Data Augmentation Approach for Learning Deep Models

A novel Bayesian formulation to data augmentation is provided, where new annotated training points are treated as missing variables and generated based on the distribution learned from the training set, and this approach produces better classification results than similar GAN models.

Generative Adversarial Nets

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
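
The two models play a minimax game over the value function V(D, G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]; D is trained to maximize it, G to minimize it. A minimal numpy sketch of estimating this value from discriminator outputs on one real batch and one generated batch (the function name is illustrative):

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) from discriminator probabilities
    on a real batch (d_real) and a generated batch (d_fake)."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))
```

At the theoretical equilibrium the discriminator outputs 1/2 everywhere and the value is −log 4.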

Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference

This work presents an efficient Bayesian CNN that offers better robustness to over-fitting on small data than traditional approaches, approximating the model's intractable posterior with Bernoulli variational distributions.
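
In practice the Bernoulli variational approximation amounts to keeping dropout active at test time and averaging several stochastic forward passes; the spread across passes serves as an uncertainty estimate. A minimal sketch under that reading, with a caller-supplied `forward` function and the mask applied only to the input for brevity (the real method masks every weight layer):

```python
import numpy as np

def mc_dropout_predict(forward, x, T, p, rng):
    """Average T stochastic forward passes, each with a fresh Bernoulli
    mask (inverted dropout), approximating the Bayesian predictive mean.
    Returns (predictive mean, per-output standard deviation)."""
    preds = np.stack([
        forward(x * (rng.random(x.shape) >= p) / (1.0 - p))
        for _ in range(T)
    ])
    return preds.mean(axis=0), preds.std(axis=0)
```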

Semi-supervised Learning with Ladder Networks

This work builds on top of the Ladder network proposed by Valpola, extending it by combining the model with supervision, and shows that the resulting model reaches state-of-the-art performance in semi-supervised MNIST and CIFAR-10 classification, as well as in permutation-invariant MNIST classification with all labels.

Auto-Encoding Variational Bayes

A stochastic variational inference and learning algorithm is introduced that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case.
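
The core device of this algorithm is the reparameterization trick, z = μ + σ ⊙ ε with ε ~ N(0, I), which moves sampling outside the gradient path, together with a closed-form KL term for a diagonal Gaussian posterior. A minimal numpy sketch of both (function names are illustrative):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """z = mu + sigma * eps, eps ~ N(0, I): sampling is expressed as a
    deterministic function of (mu, log_var), so gradients can flow."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian;
    this is the regularizer in the variational lower bound."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))
```

The KL vanishes exactly when the approximate posterior equals the standard-normal prior (μ = 0, log σ² = 0).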

Generalized Denoising Auto-Encoders as Generative Models

A different attack on the problem is proposed, which deals with arbitrary (but noisy enough) corruption, arbitrary reconstruction loss, handling both discrete and continuous-valued variables, and removing the bias due to non-infinitesimal corruption noise.
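
The training criterion is simple: corrupt the input, then learn to reconstruct the clean version under some reconstruction loss. A minimal sketch assuming Gaussian corruption and squared error (the paper allows arbitrary, sufficiently noisy corruption and arbitrary losses; the names here are illustrative):

```python
import numpy as np

def corrupt(x, noise_std, rng):
    """Gaussian corruption C(x_tilde | x); one of many admissible choices."""
    return x + noise_std * rng.standard_normal(x.shape)

def dae_loss(reconstruct, x, noise_std, rng):
    """Denoising criterion: map the corrupted input back to the clean one.
    `reconstruct` stands in for the autoencoder's forward pass."""
    x_tilde = corrupt(x, noise_std, rng)
    return np.mean((reconstruct(x_tilde) - x) ** 2)
```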

Learning Deep Architectures for AI

The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those exploiting, as building blocks, unsupervised learning of single-layer models such as Restricted Boltzmann Machines, which are used to construct deeper models such as Deep Belief Networks.

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
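
The training-time forward pass of Batch Normalization normalizes each feature over the mini-batch, then rescales and shifts with learnable parameters γ and β. A minimal numpy sketch (at inference the batch statistics are replaced by running averages, omitted here):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-time batch normalization over a (batch, features) array:
    per-feature standardization followed by a learnable affine transform."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta
```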

Dropout: a simple way to prevent neural networks from overfitting

It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
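
The mechanism is to randomly zero each unit with probability p during training; in the "inverted" formulation sketched below, activations are rescaled by 1/(1 − p) at training time so the network needs no change at test time. A minimal numpy sketch:

```python
import numpy as np

def dropout(x, p, rng, train=True):
    """Inverted dropout: zero units with probability p during training and
    rescale survivors by 1/(1-p); the identity function at test time."""
    if not train:
        return x
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask
```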

On Contrastive Divergence Learning

The properties of CD learning are studied and it is shown that it provides biased estimates in general, but that the bias is typically very small.
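
The bias in question comes from truncating the Gibbs chain: CD-1 contrasts the data statistics ("positive phase") with statistics after just one Gibbs step ("negative phase") instead of an equilibrium sample. A minimal numpy sketch of one CD-1 weight update for a binary RBM, using a common mean-field reconstruction for the visibles (all names are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_v, b_h, lr, rng):
    """One CD-1 gradient step on RBM weights W (n_visible, n_hidden).
    Truncating the chain at a single step is the source of the bias
    studied in the paper; in practice it is usually small."""
    p_h0 = sigmoid(v0 @ W + b_h)                       # positive phase p(h|v0)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float) # sample hidden units
    v1 = sigmoid(h0 @ W.T + b_v)                       # mean-field reconstruction
    p_h1 = sigmoid(v1 @ W + b_h)                       # negative phase p(h|v1)
    grad = (v0.T @ p_h0 - v1.T @ p_h1) / v0.shape[0]   # positive minus negative
    return W + lr * grad
```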