Corpus ID: 198895147

Invariance reduces Variance: Understanding Data Augmentation in Deep Learning and Beyond

@article{Chen2019InvarianceRV,
  title={Invariance reduces Variance: Understanding Data Augmentation in Deep Learning and Beyond},
  author={Shuxiao Chen and E. Dobriban and Jane Lee},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.10905}
}
Many complex deep learning models have found success by exploiting symmetries in data. Convolutional neural networks (CNNs), for example, are ubiquitous in image classification due to their use of translation symmetry, as image identity is roughly invariant to translations. In addition, many other forms of symmetry such as rotation, scale, and color shift are commonly used via data augmentation: the transformed images are added to the training set. However, a clear framework for understanding…
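To make the setup concrete, here is a minimal sketch (not from the paper) of how such symmetry transformations are typically applied as data augmentation during training; it assumes torchvision-style transforms, and the specific transforms and parameters are illustrative choices only.

```python
# Illustrative sketch only: image augmentations expressing the symmetries
# mentioned above (translation, reflection, rotation, color shift).
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomCrop(32, padding=4),                   # small translations
    transforms.RandomHorizontalFlip(p=0.5),                 # reflection
    transforms.RandomRotation(degrees=15),                  # small rotations
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # color shift
    transforms.ToTensor(),
])
# At training time each sampled image is passed through `augment`, so the model
# effectively sees randomly transformed copies in addition to the originals.
```

Citations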
On the Benefits of Invariance in Neural Networks
It is proved that training with data augmentation leads to better estimates of the risk and of its gradients, and a PAC-Bayes generalization bound is provided for models trained with data augmentation; it is further shown that, compared to data augmentation, feature averaging reduces generalization error when used with convex losses and tightens PAC-Bayes bounds.
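To illustrate the distinction drawn above, here is a minimal sketch of the two training objectives (my own illustration, assuming a finite set of transformations and a generic PyTorch-style model; `transforms` is a hypothetical list of callables):

```python
import torch

def augmented_loss(model, loss_fn, x, y, transforms):
    # Data augmentation: average the loss over transformed copies of the input.
    return torch.stack([loss_fn(model(t(x)), y) for t in transforms]).mean()

def feature_averaged_loss(model, loss_fn, x, y, transforms):
    # Feature averaging: average the model's outputs over transformed copies,
    # then evaluate the loss once on the averaged output.
    avg_out = torch.stack([model(t(x)) for t in transforms]).mean(dim=0)
    return loss_fn(avg_out, y)
```

With a loss that is convex in the model output, Jensen's inequality gives feature_averaged_loss ≤ augmented_loss, consistent with the claim that feature averaging reduces error relative to data augmentation under convex losses.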
On the Generalization Effects of Linear Transformations in Data Augmentation
This work considers a family of linear transformations and studies their effects on the ridge estimator in an over-parametrized linear regression setting, and proposes an augmentation scheme that searches over the space of transformations according to how uncertain the model is about the transformed data.
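As a rough sketch of that setting (an illustration written for this summary, not the cited paper's code): append linearly transformed copies of the data to the design matrix and fit the ridge estimator on the enlarged problem.

```python
import numpy as np

def ridge_with_linear_augmentation(X, y, transforms, lam=1.0):
    """Ridge regression on the original data plus linearly transformed copies.

    X: (n, p) design matrix, y: (n,) responses,
    transforms: list of (p, p) matrices T acting on data points x -> T x.
    """
    X_aug = np.vstack([X] + [X @ T.T for T in transforms])
    y_aug = np.concatenate([y] * (len(transforms) + 1))
    p = X.shape[1]
    # Closed-form ridge solution on the augmented problem.
    return np.linalg.solve(X_aug.T @ X_aug + lam * np.eye(p), X_aug.T @ y_aug)
```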
WeMix: How to Better Utilize Data Augmentation
This work develops two novel algorithms, termed "AugDrop" and "MixLoss", to correct the data bias in data augmentation, and proposes a generic algorithm, "WeMix", that combines AugDrop and MixLoss; its effectiveness is demonstrated in extensive empirical evaluations.
Data augmentation instead of explicit regularization
The contribution of weight decay and dropout to generalization is shown to be not only superfluous when sufficient implicit regularization is provided, but such techniques can also dramatically deteriorate performance if the hyperparameters are not carefully tuned for the architecture and data set.
A Hessian Based Complexity Measure for Deep Networks
A new measure of the complexity of the function computed by a deep network, based on the integral of the norm of the tangent Hessian, is developed; it shows that the oft-used heuristic of data augmentation imposes an implicit Hessian regularization during learning.
Contrastive Representation Learning
We propose methods to strengthen the invariance properties of representations obtained by contrastive learning. While existing approaches implicitly induce a degree of invariance as representations…
Enhanced Convolutional Neural Kernels (2019)
Recent research shows that for training with ℓ2 loss, convolutional neural networks (CNNs) whose width (number of channels in convolutional layers) goes to infinity correspond to regression with…
Probabilistic symmetry and invariant neural networks
Drawing on tools from probability and statistics, a link between functional and probabilistic symmetry is established, and generative functional representations of joint and conditional probability distributions are obtained that are invariant or equivariant under the action of a compact group.
Enhanced Convolutional Neural Tangent Kernels
The resulting kernel, CNN-GP with LAP and horizontal flip data augmentation, achieves 89% accuracy, matching the performance of AlexNet, which is the best such result the authors know of for a classifier that is not a trained neural network.
Data augmentation and image understanding
This dissertation focuses on vision and images, and uses data augmentation as a particularly useful inductive bias, as a more effective regularisation method for artificial neural networks, and as a framework to analyse and improve the invariance of vision models to perceptually plausible transformations.

References

Showing 1-10 of 81 references
Dreaming More Data: Class-dependent Distributions over Diffeomorphisms for Learned Data Augmentation
This work aligns image pairs within each class under the assumption that the spatial transformation between images belongs to a large class of diffeomorphisms, and learns class-specific probabilistic generative models of the transformations in a Riemannian submanifold of the Lie group of diffeomorphisms.
Deep Symmetry Networks
Deep symmetry networks (symnets) are introduced: a generalization of convnets that forms feature maps over arbitrary symmetry groups and uses kernel-based interpolation to tractably tie parameters and pool over symmetry spaces of any dimension.
A Kernel Theory of Modern Data Augmentation
This paper provides a general model of augmentation as a Markov process, shows that kernels appear naturally with respect to this model even when kernel classification is not explicitly employed, and analyzes more directly the effect of augmentation on kernel classifiers.
Unsupervised Data Augmentation
UDA has a small twist in that it makes use of harder and more realistic noise generated by state-of-the-art data augmentation methods, which leads to substantial improvements on six language tasks and three vision tasks even when the labeled set is extremely small.
Harmonic Networks: Deep Translation and Rotation Equivariance
H-Nets are presented, a CNN exhibiting equivariance to patch-wise translation and 360° rotation, and it is demonstrated that their layers are general enough to be used in conjunction with the latest architectures and techniques, such as deep supervision and batch normalization.
A Bayesian Data Augmentation Approach for Learning Deep Models
A novel Bayesian formulation of data augmentation is provided, in which new annotated training points are treated as missing variables and generated based on the distribution learned from the training set; this approach produces better classification results than similar GAN models.
Exploiting Cyclic Symmetry in Convolutional Neural Networks
This work introduces four operations which can be inserted into neural network models as layers, and which can be combined to make these models partially equivariant to rotations and to enable parameter sharing across different orientations.
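Loosely, two of the kinds of operations described above can be sketched as follows (an illustrative approximation, not the paper's exact layers): stack the four 90° rotations of the input along the batch dimension so that subsequent layers share parameters across orientations, and later pool over the orientation copies.

```python
import torch

def cyclic_slice(x):
    # x: (N, C, H, W) -> (4N, C, H, W); the four 90-degree rotations are stacked
    # along the batch dimension, so downstream layers apply the same filters
    # to every orientation (parameter sharing across orientations).
    return torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)], dim=0)

def cyclic_pool(x):
    # Collapse the four orientation copies by averaging, giving a representation
    # that is (approximately) invariant to 90-degree rotations of the input.
    n = x.shape[0] // 4
    return x.reshape(4, n, *x.shape[1:]).mean(dim=0)
```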
Dataset Augmentation in Feature Space
This paper adopts a simpler, domain-agnostic approach to dataset augmentation, and works in the space of context vectors generated by sequence-to-sequence models, demonstrating a technique that is effective for both static and sequential data.
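A hedged sketch of the feature-space idea summarised above: perturb or interpolate learned context vectors rather than raw inputs (the noise scale and interpolation weight below are arbitrary illustrative choices).

```python
import numpy as np

def augment_in_feature_space(z, sigma=0.1, alpha=0.5, rng=None):
    """Return noisy and interpolated copies of context vectors z of shape (n, d)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = z + sigma * rng.standard_normal(z.shape)   # additive Gaussian noise
    partner = z[rng.permutation(len(z))]               # pair each vector with another
    interpolated = z + alpha * (partner - z)           # move partway toward the partner
    return np.concatenate([noisy, interpolated], axis=0)
```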
Improved Regularization of Convolutional Neural Networks with Cutout
This paper shows that the simple regularization technique of randomly masking out square regions of the input during training, called cutout, can be used to improve the robustness and overall performance of convolutional neural networks.
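For concreteness, a minimal sketch of the masking operation described above (patch size and placement conventions here are assumptions, not the reference implementation):

```python
import numpy as np

def cutout(image, size=8, rng=None):
    """Zero out a randomly placed square patch of an (H, W, C) image array."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    cy, cx = int(rng.integers(h)), int(rng.integers(w))   # random patch center
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()
    out[y0:y1, x0:x1] = 0                                  # mask the patch
    return out
```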
A Hessian Based Complexity Measure for Deep Networks
A new measure of the complexity of the function computed by a deep network, based on the integral of the norm of the tangent Hessian, is developed; it shows that the oft-used heuristic of data augmentation imposes an implicit Hessian regularization during learning.