Data Augmentation Using GANs
@article{Tanaka2019DataAU,
  title={Data Augmentation Using GANs},
  author={Fabio Henrique Kiyoiti dos Santos Tanaka and Claus de Castro Aranha},
  journal={ArXiv},
  year={2019},
  volume={abs/1904.09135}
}
In this paper we propose the use of Generative Adversarial Networks (GANs) to generate artificial training data for machine learning tasks. The generation of artificial training data can be extremely useful in situations such as imbalanced data sets, performing a role similar to SMOTE or ADASYN. It is also useful when the data contains sensitive information and it is desirable to avoid using the original data set as much as possible (for example, medical data). We test our proposal on benchmark…
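As a point of comparison for the GAN approach, the SMOTE-style interpolation the abstract mentions can be sketched in a few lines. This is a minimal illustration of the baseline technique, not the paper's method; the function name and parameters are hypothetical:

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Generate synthetic minority samples by interpolating between a
    random minority sample and one of its k nearest neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x by squared Euclidean distance, excluding x
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(minority, n_new=4)
```

Because each synthetic point is a convex combination of two existing minority samples, the new points always lie on line segments inside the minority region; a GAN, by contrast, learns the full data distribution and can generate samples off those segments.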
99 Citations
GAN-Based Image Data Augmentation
- Computer Science
- 2020
It is demonstrated that classifiers trained on purely synthetic data achieve results comparable to those trained solely on real data, and that for small training sets, augmenting the dataset with samples from GANs trained on the data can lead to dramatic improvements in classifier performance.
Improving Software Defect Prediction using Generative Adversarial Networks
- Computer Science
- International Journal of Science and Engineering Applications
- 2020
This work tries to improve software defect prediction performance in projects where the available data is scarce and imbalanced, using Generative Adversarial Networks (GANs).
Synthesising Tabular Data using Wasserstein Conditional GANs with Gradient Penalty (WCGAN-GP)
- Computer Science
- AICS
- 2020
This study applies Wasserstein Conditional Generative Adversarial Network (WCGAN-GP) to the task of generating tabular synthetic data that is indistinguishable from the real data, without incurring information leakage.
StyleGANs and Transfer Learning for Generating Synthetic Images in Industrial Applications
- Computer Science
- Symmetry
- 2021
This paper evaluates a StyleGAN generative model with transfer learning on different application domains (training with paintings, portraits, Pokémon, bedrooms, and cats) to generate target images with different levels of content variability, and finds that StyleGAN with transfer learning produces good-quality images.
Enhancing the Classification of EEG Signals using Wasserstein Generative Adversarial Networks
- Computer Science
- 2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)
- 2020
The preliminary results suggest that introducing artificially generated signals has a positive effect on classifier performance, and a proposed method for quantifying the level of information indicates that the generated signals indeed follow the properties of the real ones.
Regularising Deep Networks with DGMs
- Computer Science
- ArXiv
- 2019
A new method for regularising neural networks is developed in which a density estimator is fit over the activations of all layers of the model; although decisions are broadly similar, this approach yields a network with better-calibrated uncertainty over the class posteriors.
Intra-Class Cutmix for Unbalanced Data Augmentation
- Computer Science
- ICMLC
- 2021
This paper proposes a data augmentation strategy called Intra-Class Cutmix for unbalanced datasets that can enhance the learning ability of neural networks for minority classes by mixing intra-class samples of the minority classes and correcting the decision boundary affected by unbalanced datasets.
Regularising Deep Networks using Deep Generative Models
- Computer Science
- 2019
This work develops a new method for regularising neural networks that learns a probability distribution over the activations of all layers of the model and then inserts imputed values into the network during training, leading to networks with better-calibrated uncertainty over the class posteriors while also delivering greater test-set accuracy.
Not Enough Data? Deep Learning to the Rescue!
- Computer Science
- ArXiv
- 2019
This work uses a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning and shows that LAMBADA improves classifiers' performance on a variety of datasets.
A fuzzy data augmentation technique to improve regularisation
- Computer Science
- International Journal of Intelligent Systems
- 2021
This paper proposes a data augmentation technique based on Fuzzy C-Means clustering and fuzzy membership grades that improves the generalisation of a Deep Neural Network; the technique is suitable for numerical data and manages to balance the classification model's bias-variance tradeoff.
References
Synthetic data augmentation using GAN for improved liver lesion classification
- Computer Science
- 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018)
- 2018
A training scheme that first uses classical data augmentation to enlarge the training set and then further enlarges the data size and its diversity by applying GAN techniques for synthetic data augmentation is proposed.
BAGAN: Data Augmentation with Balancing GAN
- Computer Science
- ArXiv
- 2018
This work proposes balancing GAN (BAGAN) as an augmentation tool to restore balance in imbalanced datasets and compares the proposed methodology with state-of-the-art GANs and demonstrates that BAGAN generates images of superior quality when trained with an imbalanced dataset.
The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology
- Computer Science
- Oncotarget
- 2017
This paper presents the first application of generative adversarial autoencoders (AAE) for generating novel molecular fingerprints with a defined set of parameters; a 7-layer AAE architecture was developed, with the latent middle layer serving as a discriminator.
Progressive Growing of GANs for Improved Quality, Stability, and Variation
- Computer Science
- ICLR
- 2018
A new training methodology for generative adversarial networks is described, starting from a low resolution, and adding new layers that model increasingly fine details as training progresses, allowing for images of unprecedented quality.
ADASYN: Adaptive synthetic sampling approach for imbalanced learning
- Computer Science
- 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
- 2008
Simulation analyses on several machine learning data sets show the effectiveness of the ADASYN sampling approach across five evaluation metrics.
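ADASYN's adaptive allocation step can be sketched as follows. This is a minimal illustration of the core idea (minority samples with more majority-class neighbours receive more synthetic samples); the function name and parameters are hypothetical:

```python
def adasyn_counts(minority, majority, total_new, k=5):
    """ADASYN's adaptive allocation: for each minority sample, compute the
    fraction r_i of majority points among its k nearest neighbours, then
    distribute `total_new` synthetic samples proportionally to r_i."""
    # Label 0 = minority, 1 = majority.
    all_points = [(p, 0) for p in minority] + [(p, 1) for p in majority]
    ratios = []
    for x in minority:
        neighbours = sorted(
            (q for q in all_points if q[0] != x),
            key=lambda q: sum((a - b) ** 2 for a, b in zip(x, q[0])),
        )[:k]
        r = sum(label for _, label in neighbours) / k  # majority fraction
        ratios.append(r)
    norm = sum(ratios) or 1.0
    return [round(total_new * r / norm) for r in ratios]

minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
majority = [(4.0, 5.0), (5.0, 4.0), (6.0, 5.0), (5.0, 6.0), (10.0, 10.0)]
counts = adasyn_counts(minority, majority, total_new=12, k=3)
# The minority point at (5, 5), surrounded by majority points, gets the most.
```

The per-sample counts would then be realised by SMOTE-style interpolation; the adaptive weighting is what distinguishes ADASYN from plain SMOTE, which spreads synthetic samples uniformly.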
Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
- Computer Science
- ICLR
- 2016
In this paper we present a method for learning a discriminative classifier from unlabeled or partially labeled data. Our approach is based on an objective function that trades-off mutual information…
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
- Computer Science
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
SRGAN, a generative adversarial network (GAN) for image super-resolution (SR), is presented; to the authors' knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors, using a perceptual loss function that consists of an adversarial loss and a content loss.
Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus
- Computer Science
- 1988
This paper tests the ability of an early neural network model, ADAP, to forecast the onset of diabetes mellitus in a high-risk population of Pima Indians, and compares the results with those obtained from logistic regression and linear perceptron models using precisely the same training and forecasting sets.
Generative Adversarial Nets
- Computer Science
- NIPS
- 2014
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a…
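The adversarial process described in this abstract is, in its standard published form, a two-player minimax game over the value function:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)]
  + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```

where D(x) is the discriminator's estimate of the probability that x came from the data rather than the generator, and G(z) maps a noise sample z to a generated sample.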
A Novel Boundary Oversampling Algorithm Based on Neighborhood Rough Set Model: NRSBoundary-SMOTE
- Computer Science
- 2013
In experiments with four kinds of classifiers, NRSBoundary-SMOTE achieves higher accuracy than other methods when C4.5, CART, and KNN are used, but performs worse than SMOTE with an SVM classifier.