Corpus ID: 126180363

Data Augmentation Using GANs

Authors: Fabio Henrique Kiyoiti dos Santos Tanaka, Claus de Castro Aranha
In this paper we propose the use of Generative Adversarial Networks (GANs) to generate artificial training data for machine learning tasks. Generating artificial training data can be extremely useful in situations such as imbalanced data sets, where it plays a role similar to SMOTE or ADASYN. It is also useful when the data contains sensitive information and it is desirable to avoid using the original data set as much as possible (for example, medical data). We test our proposal on benchmark… 
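The abstract positions GAN-based generation as an alternative to interpolation-based oversamplers such as SMOTE. For reference, a minimal SMOTE-style oversampler (a pure-NumPy sketch for illustration, not the paper's code) creates synthetic minority samples by interpolating between each minority sample and one of its nearest minority-class neighbours:

```python
import numpy as np

def smote(minority: np.ndarray, n_new: int, k: int = 5, seed: int = 0) -> np.ndarray:
    """Generate n_new synthetic minority samples by interpolating
    between random minority samples and their k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    n = len(minority)
    # pairwise Euclidean distances within the minority class
    d = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # exclude self-matches
    neighbours = np.argsort(d, axis=1)[:, :k]
    synth = []
    for _ in range(n_new):
        i = rng.integers(n)               # pick a random minority sample
        j = rng.choice(neighbours[i])     # and one of its neighbours
        lam = rng.random()                # interpolation factor in [0, 1)
        synth.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.stack(synth)
```

Because each synthetic point is a convex combination of two real minority samples, it always lies on the line segment between them — the limitation that motivates learned generators such as GANs.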
GAN-Based Image Data Augmentation
It is demonstrated that training classifiers on purely synthetic data achieves comparable results to those trained solely on pure data and that for small sets of training data, augmenting the dataset by first training GANs on the data can lead to dramatic improvement in classifier performance.
Improving Software Defect Prediction using Generative Adversarial Networks
This work tries to improve the software defect prediction performance in projects, where the data available is less and imbalanced, using Generative Adversarial Networks (GANs).
Synthesising Tabular Data using Wasserstein Conditional GANs with Gradient Penalty (WCGAN-GP)
This study applies Wasserstein Conditional Generative Adversarial Network (WCGAN-GP) to the task of generating tabular synthetic data that is indistinguishable from the real data, without incurring information leakage.
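For context, the gradient-penalty critic loss that the WGAN-GP formulation builds on (notation taken from the original WGAN-GP paper, not from this summary) is:

```latex
L_D = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}[D(\tilde{x})]
    - \mathbb{E}_{x \sim \mathbb{P}_r}[D(x)]
    + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}
      \left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right]
```

where $\hat{x}$ is sampled uniformly along straight lines between pairs of real and generated samples, replacing the weight clipping of the original WGAN with a soft constraint on the critic's gradient norm.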
StyleGANs and Transfer Learning for Generating Synthetic Images in Industrial Applications
This paper evaluates a StyleGAN generative model with transfer learning on different application domains (training with paintings, portraits, Pokémon, bedrooms, and cats) to generate target images with different levels of content variability, and finds that StyleGAN with transfer learning produced good-quality images.
Enhancing the Classification of EEG Signals using Wasserstein Generative Adversarial Networks
The preliminary results suggest that the introduction of artificially generated signals has a positive effect on the performance of the classifier; a method to quantify the level of information is also proposed, which indicates that the generated signals indeed follow the properties of the real ones.
Regularising Deep Networks with DGMs
A new method for regularising neural networks is developed that fits a density estimator over the activations of all layers of the model; although decisions are broadly similar, this approach provides a network with better-calibrated uncertainty measures over the class posteriors.
Intra-Class Cutmix for Unbalanced Data Augmentation
This paper proposes a data augmentation strategy called Intra-Class Cutmix for unbalanced datasets that can enhance the learning ability of neural networks for minority classes by mixing the intra-class samples of minority classes, and correct the decision boundary affected by unbalanced datasets.
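Because both source images in the intra-class variant share a label, no label mixing is required. A minimal sketch of the patch-pasting step (an illustrative NumPy implementation; the paper's exact patch-sampling scheme is assumed, not confirmed) looks like:

```python
import numpy as np

def intra_class_cutmix(a: np.ndarray, b: np.ndarray, seed: int = 0) -> np.ndarray:
    """Paste a random rectangular patch from image b into a copy of image a.
    Both images come from the same (minority) class, so the label is unchanged."""
    rng = np.random.default_rng(seed)
    h, w = a.shape[:2]
    lam = rng.random()                    # fraction of area kept from a
    cut_h = int(h * np.sqrt(1 - lam))     # patch side lengths, CutMix-style
    cut_w = int(w * np.sqrt(1 - lam))
    cy, cx = rng.integers(h), rng.integers(w)          # random patch centre
    y0, y1 = np.clip([cy - cut_h // 2, cy + cut_h // 2], 0, h)
    x0, x1 = np.clip([cx - cut_w // 2, cx + cut_w // 2], 0, w)
    out = a.copy()
    out[y0:y1, x0:x1] = b[y0:y1, x0:x1]   # paste the patch from b
    return out
```

The square-root area scaling mirrors the original CutMix formulation, where the patch area is proportional to 1 − λ.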
Regularising Deep Networks using Deep Generative Models
This work develops a new method for regularising neural networks that learns a probability distribution over the activations of all layers of the model and then inserts imputed values into the network during training, leading to networks with better calibrated uncertainty over the class posteriors all the while delivering greater test-set accuracy.
Not Enough Data? Deep Learning to the Rescue!
This work uses a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning and shows that LAMBADA improves classifiers' performance on a variety of datasets.
A fuzzy data augmentation technique to improve regularisation
This paper proposes a data augmentation technique based on Fuzzy C-Means clustering and fuzzy membership grades that is used to improve the generalisation of a Deep Neural Network that is suitable for numerical data and manages to balance the classification model's bias-variance tradeoff.


Synthetic data augmentation using GAN for improved liver lesion classification
A training scheme that first uses classical data augmentation to enlarge the training set and then further enlarges the data size and its diversity by applying GAN techniques for synthetic data augmentation is proposed.
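The "classical data augmentation" stage in that scheme typically means simple label-preserving geometric transforms applied before any GAN-based synthesis. A hedged sketch (illustrative only; the paper's exact transform set is not specified in this summary):

```python
import numpy as np

def classic_augment(img: np.ndarray, seed: int = 0) -> list[np.ndarray]:
    """Simple label-preserving transforms: flips, a 90-degree rotation,
    and a small random translation implemented as a circular shift."""
    rng = np.random.default_rng(seed)
    shift = rng.integers(-2, 3, size=2)      # random offset in [-2, 2]
    return [
        np.fliplr(img),                      # horizontal flip
        np.flipud(img),                      # vertical flip
        np.rot90(img),                       # 90-degree rotation
        np.roll(img, shift, axis=(0, 1)),    # small translation
    ]
```

Each transform yields a new training sample with the original label; the GAN stage then adds diversity beyond what such fixed transforms can reach.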
BAGAN: Data Augmentation with Balancing GAN
This work proposes balancing GAN (BAGAN) as an augmentation tool to restore balance in imbalanced datasets and compares the proposed methodology with state-of-the-art GANs and demonstrates that BAGAN generates images of superior quality when trained with an imbalanced dataset.
The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology
This paper presents the first application of generative adversarial autoencoders (AAE) for generating novel molecular fingerprints with a defined set of parameters; the authors developed a 7-layer AAE architecture with the latent middle layer serving as a discriminator.
Progressive Growing of GANs for Improved Quality, Stability, and Variation
A new training methodology for generative adversarial networks is described, starting from a low resolution, and adding new layers that model increasingly fine details as training progresses, allowing for images of unprecedented quality.
ADASYN: Adaptive synthetic sampling approach for imbalanced learning
Simulation analyses on several machine learning data sets show the effectiveness of the ADASYN sampling approach across five evaluation metrics.
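ADASYN's core idea is to allocate more synthetic samples to minority points whose neighbourhoods are dominated by the majority class, i.e. the hardest-to-learn regions. A compact NumPy sketch of that allocation-then-interpolation loop (an illustration under simplifying assumptions, not the reference implementation):

```python
import numpy as np

def adasyn(X: np.ndarray, y: np.ndarray, minority: int, k: int = 5,
           seed: int = 0) -> np.ndarray:
    """Generate synthetic minority samples, weighted toward minority
    points whose k nearest neighbours are mostly majority class."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = (y != minority).sum() - len(X_min)     # aim for class balance
    # ratio of majority-class neighbours for each minority sample
    d = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=-1)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]         # skip the self-match
    r = (y[nn] != minority).mean(axis=1)
    if r.sum() == 0:                               # no hard samples: uniform
        r = np.ones(len(X_min))
    g = np.rint(r / r.sum() * n_new).astype(int)   # per-sample allocation
    # minority-only neighbours used for the interpolation step
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d_min, np.inf)
    nn_min = np.argsort(d_min, axis=1)[:, :k]
    synth = []
    for i, g_i in enumerate(g):
        for _ in range(g_i):
            j = rng.choice(nn_min[i])
            lam = rng.random()
            synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.stack(synth) if synth else np.empty((0, X.shape[1]))
```

The density ratio r is what distinguishes ADASYN from plain SMOTE: samples surrounded by majority points receive proportionally more synthetic neighbours.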
Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
In this paper we present a method for learning a discriminative classifier from unlabeled or partially labeled data. Our approach is based on an objective function that trades off mutual information… 
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
SRGAN, a generative adversarial network (GAN) for image super-resolution (SR), is presented; to the authors' knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors, and it uses a perceptual loss function that consists of an adversarial loss and a content loss.
Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus
Testing the ability of an early neural network model, ADAP, to forecast the onset of diabetes mellitus in a high risk population of Pima Indians and comparing the results with those obtained from logistic regression and linear perceptron models using precisely the same training and forecasting sets.
Generative Adversarial Nets
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a… 
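The abstract is cut off mid-definition; the two-player minimax objective from the original Generative Adversarial Nets paper, which it introduces, is:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)]
+ \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```

where the discriminator D estimates the probability that a sample came from the data rather than the generator G, and G is trained to make its samples indistinguishable from real data.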
A Novel Boundary Oversampling Algorithm Based on Neighborhood Rough Set Model: NRSBoundary-SMOTE
After conducting experiments with four kinds of classifiers, NRSBoundary-SMOTE achieves higher accuracy than other methods when C4.5, CART, and KNN are used, but it is worse than SMOTE with the SVM classifier.