Tracking translation invariance in CNNs

@article{Myburgh2021TrackingTI,
  title={Tracking translation invariance in CNNs},
  author={Johannes C. Myburgh and Coenraad Mouton and Marelie Hattingh Davel},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.05997}
}
Although Convolutional Neural Networks (CNNs) are widely used, their translation invariance (ability to deal with translated inputs) is still subject to some controversy. We explore this question using translation-sensitivity maps to quantify how sensitive a standard CNN is to a translated input. We propose the use of cosine similarity as the sensitivity metric, rather than Euclidean distance, and discuss the importance of restricting the dimensionality of either of these metrics when comparing…
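To make the measurement concrete, here is a minimal sketch of a translation-sensitivity map, assuming a trained PyTorch classifier. It uses circular shifts via torch.roll for brevity (the paper's translations need not wrap around), and the names model, x, and max_shift are placeholders:

```python
import torch
import torch.nn.functional as F

def translation_sensitivity_map(model, x, max_shift=8):
    """Cosine similarity between the model's output for the original
    image and for every (dx, dy) translation, as a 2D map.

    x: a single input image of shape (1, C, H, W).
    """
    model.eval()
    with torch.no_grad():
        ref = model(x).flatten()
        size = 2 * max_shift + 1
        sens = torch.zeros(size, size)
        for i, dx in enumerate(range(-max_shift, max_shift + 1)):
            for j, dy in enumerate(range(-max_shift, max_shift + 1)):
                # Circular shift for simplicity; padded translation
                # would avoid wrap-around artifacts.
                shifted = torch.roll(x, shifts=(dy, dx), dims=(2, 3))
                out = model(shifted).flatten()
                sens[j, i] = F.cosine_similarity(ref, out, dim=0)
    return sens  # 1.0 = fully invariant to that shift
```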

Citations

Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks.
  • Bing Yang, Zhenxin Zhang, +4 authors Ai-Bing Zhang
  • Medicine
  • Systematic biology
  • 2021
TLDR: A convolutional neural network method, the morphology-molecule network (MMNet), that integrates morphological and molecular data for species identification; it worked better than four currently available alternative methods when tested on 10 independent datasets representing varying genetic diversity from different taxa.
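As a purely illustrative sketch of the general two-branch pattern such a method follows (separate encoders for image and sequence inputs, fused for a joint prediction); the layer sizes and fusion scheme here are hypothetical, not MMNet's actual design:

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """Hypothetical two-modality classifier: a 2D CNN for morphology
    images and a 1D CNN for one-hot DNA sequences, concatenated
    before a joint species prediction head."""
    def __init__(self, n_species):
        super().__init__()
        self.img = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.seq = nn.Sequential(
            nn.Conv1d(4, 16, 7, padding=3), nn.ReLU(),  # 4 = A/C/G/T
            nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.head = nn.Linear(32, n_species)

    def forward(self, image, onehot_dna):
        feats = torch.cat([self.img(image), self.seq(onehot_dna)], dim=1)
        return self.head(feats)
```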
Grounding inductive biases in natural images: invariance stems from variations in data
TLDR: Scale and translation invariance were found to be similar across residual networks and vision-transformer models despite their markedly different architectural inductive biases, and the main factors of variation in ImageNet mostly relate to appearance and are specific to each class.

References

Showing 1–10 of 16 references
Quantifying Translation-Invariance in Convolutional Neural Networks
TLDR: This analysis identifies training data augmentation as the most important factor in obtaining translation-invariant representations of images using convolutional neural networks.
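For context, translation augmentation of this kind is a one-liner in common frameworks; a minimal torchvision sketch, with an illustrative shift range:

```python
import torchvision.transforms as T

# Randomly translate each training image by up to ±10% of its size,
# encouraging the network to map shifted copies to the same output.
augment = T.Compose([
    T.RandomAffine(degrees=0, translate=(0.1, 0.1)),
    T.ToTensor(),
])
```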
Understanding image representations by measuring their equivariance and equivalence
  • Karel Lenc, A. Vedaldi
  • Computer Science, Mathematics
  • 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2015
TLDR: Three key mathematical properties of representations (equivariance, invariance, and equivalence) are investigated and applied to popular representations, revealing insightful aspects of their structure, including at which layers in a CNN certain geometric invariances are achieved.
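A rough sketch of how such a per-layer invariance measurement can be set up in PyTorch with a forward hook; this approximates the idea rather than the paper's exact protocol, and model, layer, and the shift are placeholders:

```python
import torch
import torch.nn.functional as F

def layer_invariance(model, layer, x, shift=(4, 0)):
    """Cosine similarity of one layer's activations for an input and
    its translated copy (1.0 = the layer is invariant to the shift)."""
    acts = []
    hook = layer.register_forward_hook(
        lambda mod, inp, out: acts.append(out.detach().flatten()))
    model.eval()
    with torch.no_grad():
        model(x)
        model(torch.roll(x, shifts=shift, dims=(2, 3)))
    hook.remove()
    return F.cosine_similarity(acts[0], acts[1], dim=0).item()
```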
Making Convolutional Networks Shift-Invariant Again
TLDR: This work demonstrates that anti-aliasing by low-pass filtering before downsampling, a classical signal-processing technique that has been undeservedly overlooked in modern deep networks, is compatible with existing architectural components such as max-pooling and strided convolution.
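A minimal sketch of the idea (often called BlurPool): depthwise low-pass filtering before subsampling. The 3x3 binomial kernel and reflect padding are one common choice, not necessarily the paper's exact configuration; the reference implementation lives at https://github.com/adobe/antialiased-cnns:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    """Anti-aliased downsampling: low-pass filter, then subsample."""
    def __init__(self, channels, stride=2):
        super().__init__()
        k = torch.tensor([1., 2., 1.])
        k = torch.outer(k, k)
        k = k / k.sum()                         # 3x3 binomial low-pass kernel
        # One copy of the kernel per channel (depthwise filtering).
        self.register_buffer('kernel', k.repeat(channels, 1, 1, 1))
        self.stride = stride
        self.channels = channels

    def forward(self, x):
        x = F.pad(x, (1, 1, 1, 1), mode='reflect')
        return F.conv2d(x, self.kernel, stride=self.stride,
                        groups=self.channels)
```

In use, a strided pooling or convolution is replaced by its stride-1 version followed by this layer, e.g. MaxPool2d(2) becomes MaxPool2d(2, stride=1) then BlurPool2d(channels).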
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement on prior-art configurations can be achieved by pushing the depth to 16–19 weight layers.
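The core design idea, stacking small filters, can be sketched as follows: two 3x3 convolutions cover a 5x5 receptive field with fewer parameters and an extra nonlinearity (channel counts here are illustrative):

```python
import torch.nn as nn

# A VGG-style stage: repeated 3x3 convolutions followed by pooling.
block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
)
```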
ImageNet classification with deep convolutional neural networks
TLDR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes, employing a recently developed regularization method called "dropout" that proved to be very effective.
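For illustration, dropout as applied in the fully connected part of such a network; a PyTorch sketch with layer sizes matching the common AlexNet description, with the convolutional trunk omitted:

```python
import torch.nn as nn

# Dropout randomly zeroes activations during training, reducing
# co-adaptation between units; it is disabled at evaluation time.
classifier = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    nn.Linear(4096, 1000),
)
```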
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
TLDR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified linear unit, and derives a robust initialization method that specifically accounts for the rectifier nonlinearities.
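A brief sketch of both contributions as exposed in PyTorch, with an illustrative channel count:

```python
import torch.nn as nn

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
# He (Kaiming) initialization, derived in the paper for rectifier nets.
nn.init.kaiming_normal_(conv.weight, nonlinearity='relu')
# PReLU: like ReLU, but with a learnable slope for negative inputs,
# here one parameter per channel.
act = nn.PReLU(num_parameters=64)
```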
Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition
TLDR: The aim is to gain insight into different pooling functions by directly comparing them on a fixed architecture across several common object recognition tasks; empirical results show that a maximum pooling operation significantly outperforms subsampling operations.
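The two operations being compared, in a minimal PyTorch form (tensor shape illustrative):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 8, 32, 32)
max_pooled = F.max_pool2d(x, kernel_size=2)  # keeps the strongest response
avg_pooled = F.avg_pool2d(x, kernel_size=2)  # averaging-style subsampling
```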
Deep Residual Learning for Image Recognition
TLDR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize and can gain accuracy from considerably increased depth.
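A minimal sketch of the basic residual block this framework introduces: the layers learn a residual F(x) and the block outputs F(x) + x, so identity mappings are easy to represent. The channel count is illustrative, and the paper's blocks also use a projection shortcut when shapes change:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block with an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)  # F(x) + x
```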
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition…
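A simplified sketch of the Inception idea: parallel convolutions at multiple scales whose outputs are concatenated along the channel axis. Channel counts are illustrative, and the full module also places 1x1 reductions before the larger convolutions:

```python
import torch
import torch.nn as nn

class InceptionBranches(nn.Module):
    """Simplified Inception module: 1x1, 3x3, 5x5, and pooling branches
    run in parallel and are concatenated channel-wise."""
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 32, 1)
        self.b3 = nn.Conv2d(in_ch, 32, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, 32, 5, padding=2)
        self.pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                  nn.Conv2d(in_ch, 32, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x),
                          self.pool(x)], dim=1)
```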
Gradient-based learning applied to document recognition
TLDR: This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
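A LeNet-style network in the spirit of the architecture this paper popularized, sketched in PyTorch for 28x28 digit images; the layer sizes follow the common LeNet-5 description and are illustrative:

```python
import torch.nn as nn

# Alternating convolution and subsampling stages, then fully
# connected layers mapping to the 10 digit classes.
lenet = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Tanh(),
    nn.AvgPool2d(2),                       # 28x28 -> 14x14
    nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),
    nn.AvgPool2d(2),                       # 10x10 -> 5x5
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
    nn.Linear(120, 84), nn.Tanh(),
    nn.Linear(84, 10),
)
```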