Corpus ID: 211532796

Understanding and Enhancing Mixed Sample Data Augmentation

@article{Harris2020UnderstandingAE,
  title={Understanding and Enhancing Mixed Sample Data Augmentation},
  author={Ethan Harris and Antonia Marcu and Matthew Painter and Mahesan Niranjan and Adam Pr{\"u}gel-Bennett and Jonathon S. Hare},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.12047}
}
Mixed Sample Data Augmentation (MSDA) has received increasing attention in recent years, with many successful variants such as MixUp and CutMix. Following insight on the efficacy of CutMix in particular, we propose FMix, an MSDA that uses binary masks obtained by applying a threshold to low-frequency images sampled from Fourier space. FMix improves performance over MixUp and CutMix for a number of state-of-the-art models across a range of data sets and problem settings. We go on to analyse MixUp… 
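For concreteness, here is a minimal NumPy sketch of the mask-sampling idea the abstract describes: sample a random complex spectrum, attenuate high frequencies, invert to image space, and threshold so that a proportion λ of pixels survives. The function name, the decay exponent, and the Beta-sampled λ are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def sample_fmix_mask(h, w, lam, decay_power=3.0):
    """Sketch of FMix-style mask sampling: a random low-frequency
    greyscale image from Fourier space, thresholded so that a
    proportion `lam` of pixels is 1."""
    # Random complex spectrum in rfft2 layout (h rows, w//2 + 1 cols).
    spectrum = (np.random.randn(h, w // 2 + 1)
                + 1j * np.random.randn(h, w // 2 + 1))
    # Low-pass filter: scale each bin by 1 / f^decay_power.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.rfftfreq(w)[None, :]
    freq = np.sqrt(fy ** 2 + fx ** 2)
    freq[0, 0] = 1.0  # avoid dividing by zero at the DC bin
    spectrum /= freq ** decay_power
    # Back to image space; keep the top-lam fraction of pixels.
    grey = np.fft.irfft2(spectrum, s=(h, w))
    threshold = np.quantile(grey, 1.0 - lam)
    return (grey > threshold).astype(np.float32)

# Usage: mix two images x1, x2 with a Beta-sampled mixing ratio.
lam = np.random.beta(1.0, 1.0)
mask = sample_fmix_mask(32, 32, lam)
# mixed = mask[..., None] * x1 + (1 - mask[..., None]) * x2
```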
Grid cut and mix: flexible and efficient data augmentation
TLDR
This paper combines the advantages of information dropping and information mixing, proposing GridCut and GridMix: methods built on structured deletion of the input images, in the style of GridMask, that can be flexibly expanded into Cutout, CutMix and GridMask.
An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation
TLDR
A novel Class-Discrimination metric is presented to quantitatively measure the dichotomy in performance across augmentation strategies and to link it to the discriminative capacity the different strategies induce in a network's latent space.
ChannelMix: A Mixed Sample Data Augmentation Strategy for Image Classification
TLDR
This paper proposes a novel mixed sample data augmentation (MSDA) strategy named ChannelMix that uses the multi-channel information and labels of paired samples to regularize the training process through convex combination, which can guide the network to pay more attention to the less discriminative parts.
DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic Segmentation
TLDR
A novel Dual Soft-Paste method is proposed that helps the model learn domain-invariant features from intermediate domains, leading to faster convergence and better performance.
On the Effects of Data Distortion on Model Analysis and Training
TLDR
It is shown that current shape-bias identification methods and occlusion robustness measures are biased, a fairer alternative is proposed, and it is argued that the impact of such artefacts must be understood and exploited rather than eliminated.
Causal Explanations of Image Misclassifications
TLDR
To reduce misclassifications caused by interference from non-essential information, this study erases the pixels within bounding boxes anchored at the top 5% of pixels in the saliency map.
On Data-centric Myths
TLDR
Theoretical directions concerning which aspects of the data matter are analyzed, and it is shown that 1) data dimension should not necessarily be minimised and 2) when manipulating data, preserving the distribution is inessential.
Milking CowMask for Semi-Supervised Image Classification
TLDR
A novel mask-based augmentation method called CowMask is presented, using it to provide perturbations for semi-supervised consistency regularization, which achieves a state-of-the-art result on ImageNet with 10% labeled data.
ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning
TLDR
This work proposes a novel data augmentation mechanism called ClassMix, which generates augmentations by mixing unlabelled samples, leveraging the network's predictions to respect object boundaries, and attains state-of-the-art results.
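The TLDR above describes mixing driven by predicted object boundaries; a minimal NumPy sketch of that idea follows. The function name and argument shapes are illustrative assumptions: `pred_a` and `pred_b` are per-pixel class maps, and regions of half the classes predicted in one image are pasted onto the other.

```python
import numpy as np

def classmix(img_a, pred_a, img_b, pred_b):
    """ClassMix-style sketch: build a binary mask from half of the
    classes predicted in image A and paste those regions onto image B;
    the pseudo-label maps are mixed with the same mask."""
    classes = np.unique(pred_a)
    chosen = np.random.choice(classes, size=max(1, len(classes) // 2),
                              replace=False)
    mask = np.isin(pred_a, chosen)           # (H, W) boolean
    mixed_img = np.where(mask[..., None], img_a, img_b)
    mixed_lbl = np.where(mask, pred_a, pred_b)
    return mixed_img, mixed_lbl
```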

References

SHOWING 1-10 OF 33 REFERENCES
Improved Mixed-Example Data Augmentation
TLDR
This work explores a new, more generalized form of this type of data augmentation in order to determine whether the linearity of mixing is necessary, and finds a much larger space of practical augmentation techniques, including methods that improve upon the previous state of the art.
MixUp as Locally Linear Out-Of-Manifold Regularization
TLDR
MixUp is understood as a form of “out-of-manifold regularization” that imposes certain “local linearity” constraints on the model’s input space beyond the data manifold; this understanding enables a novel adaptive version of MixUp in which the mixing policies are automatically learned from the data using an additional network and an objective function designed to avoid manifold intrusion.
Understanding Mixup Training Methods
TLDR
A spatial mixup approach is proposed that achieves state-of-the-art performance on the CIFAR and ImageNet data sets and gives generative adversarial nets a more stable training process and more diverse sample generation ability.
Improved Regularization of Convolutional Neural Networks with Cutout
TLDR
This paper shows that the simple regularization technique of randomly masking out square regions of input during training, which is called cutout, can be used to improve the robustness and overall performance of convolutional neural networks.
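A minimal NumPy sketch of the square masking described above; the 16-pixel patch size and zero fill are illustrative defaults, not prescribed values.

```python
import numpy as np

def cutout(image, size=16):
    """Cutout-style sketch: zero out a square region at a random
    centre. The centre may fall near the border, so the visible
    cut can be smaller than size x size."""
    h, w = image.shape[:2]
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = max(0, cy - size // 2), min(h, cy + size // 2)
    x1, x2 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = image.copy()
    out[y1:y2, x1:x2] = 0
    return out
```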
Data Augmentation by Pairing Samples for Images Classification
TLDR
This paper introduces a simple but surprisingly effective data augmentation technique for image classification tasks, named SamplePairing, which significantly improves classification accuracy on all tested datasets and is most valuable for tasks with a limited amount of training data, such as medical imaging.
Data Augmentation Using Random Image Cropping and Patching for Deep CNNs
TLDR
A new data augmentation technique called random image cropping and patching (RICAP) is proposed, which randomly crops four images and patches them together to create a new training image, achieving a new state-of-the-art test error of 2.19% on CIFAR-10.
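A minimal NumPy sketch of the four-image crop-and-patch idea, with labels mixed in proportion to patch areas; the argument shapes and the Beta parameter are illustrative assumptions.

```python
import numpy as np

def ricap(images, labels, beta=0.3):
    """RICAP-style sketch: tile random crops of four images onto one
    canvas; `images` is (4, H, W, C), `labels` is (4, num_classes)."""
    h, w = images.shape[1:3]
    # Boundary point splitting the canvas into four regions.
    bh = int(round(h * np.random.beta(beta, beta)))
    bw = int(round(w * np.random.beta(beta, beta)))
    sizes = [(bh, bw), (bh, w - bw), (h - bh, bw), (h - bh, w - bw)]
    offsets = [(0, 0), (0, bw), (bh, 0), (bh, bw)]
    canvas = np.zeros_like(images[0])
    label = np.zeros(labels.shape[1], dtype=np.float64)
    for k, ((ph, pw), (oy, ox)) in enumerate(zip(sizes, offsets)):
        if ph == 0 or pw == 0:
            continue  # degenerate region contributes zero area
        sy = np.random.randint(h - ph + 1)   # random crop origin
        sx = np.random.randint(w - pw + 1)
        canvas[oy:oy + ph, ox:ox + pw] = images[k, sy:sy + ph, sx:sx + pw]
        label += (ph * pw) / (h * w) * labels[k]
    return canvas, label
```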
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features
TLDR
Patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches, and CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on ImageNet weakly-supervised localization task.
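A minimal NumPy sketch of the patch pasting and area-proportional label mixing described above; one-hot label arrays are assumed, and the rectangle sizing follows a λ sampled from Beta(α, α).

```python
import numpy as np

def cutmix(img_a, lbl_a, img_b, lbl_b, alpha=1.0):
    """CutMix-style sketch: paste a random rectangle from image B into
    image A and mix one-hot labels by the actual pasted area."""
    h, w = img_a.shape[:2]
    lam = np.random.beta(alpha, alpha)
    # Rectangle covering roughly a (1 - lam) fraction of the image.
    rh, rw = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    top, bottom = max(0, cy - rh // 2), min(h, cy + rh // 2)
    left, right = max(0, cx - rw // 2), min(w, cx + rw // 2)
    mixed = img_a.copy()
    mixed[top:bottom, left:right] = img_b[top:bottom, left:right]
    # Recompute lam from the clipped rectangle's true area.
    lam = 1.0 - (bottom - top) * (right - left) / (h * w)
    return mixed, lam * lbl_a + (1 - lam) * lbl_b
```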
mixup: Beyond Empirical Risk Minimization
TLDR
This work proposes mixup, a simple learning principle that trains a neural network on convex combinations of pairs of examples and their labels, which improves the generalization of state-of-the-art neural network architectures.
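A minimal NumPy sketch of the convex-combination rule described above, assuming one-hot labels and λ drawn from Beta(α, α).

```python
import numpy as np

def mixup(img_a, lbl_a, img_b, lbl_b, alpha=1.0):
    """mixup-style sketch: the same lam blends both the inputs and
    their one-hot labels."""
    lam = np.random.beta(alpha, alpha)
    return (lam * img_a + (1 - lam) * img_b,
            lam * lbl_a + (1 - lam) * lbl_b)
```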
Deep Pyramidal Residual Networks
TLDR
This research gradually increases the feature map dimension at all units to involve as many locations as possible in the network architecture and proposes a novel residual unit capable of further improving the classification accuracy with the new network architecture.
Identity Mappings in Deep Residual Networks
TLDR
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.