Corpus ID: 6099034

Spatial Transformer Networks

@inproceedings{Jaderberg2015SpatialTN,
  title={Spatial Transformer Networks},
  author={Max Jaderberg and Karen Simonyan and Andrew Zisserman and Koray Kavukcuoglu},
  booktitle={NIPS},
  year={2015}
}
Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter efficient manner. In this work we introduce a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network. This differentiable module can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps, conditional on the feature map itself, without any extra training supervision or modification to the optimisation process.
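The module consists of a localisation network that regresses transformation parameters, a grid generator, and a differentiable sampler. A minimal PyTorch sketch of that pipeline follows; the layer sizes are illustrative choices rather than the paper's exact configuration, and affine_grid/grid_sample stand in for the grid generator and sampler.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    """Minimal spatial transformer: localisation net -> affine grid -> sampler."""
    def __init__(self, in_channels: int):
        super().__init__()
        # Localisation network: regresses the 6 parameters of a 2D affine transform.
        # (Layer sizes are illustrative, not the paper's configuration.)
        self.loc = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(10, 6),
        )
        # Initialise to the identity transform so training starts from "no warp".
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        theta = self.loc(x).view(-1, 2, 3)                   # (N, 2, 3) affine matrices
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)   # bilinear sampling

# Drop-in usage: warped = SpatialTransformer(1)(torch.randn(4, 1, 28, 28))
```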

Citations of this paper

Spatial Transformations in Deep Neural Networks

  • Michał Bednarek, K. Walas
  • 2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)
  • 2018
This paper introduces an end-to-end system that is able to learn spatial invariance, including in-plane and out-of-plane rotations, and shows that it can successfully improve the classification score by adding the so-called Spatial Transformer module.

Deep Diffeomorphic Transformer Networks

This work investigates the use of flexible diffeomorphic image transformations within neural networks and demonstrates that significant performance gains can be attained over currently-used models.

A Refined Spatial Transformer Network

Experimental results reveal that a module designed to estimate the difference between the ground truth and the STN output outperforms state-of-the-art methods on the cluttered MNIST handwritten digit classification task and a planar image alignment task.

Studying Invariances of Trained Convolutional Neural Networks

A new learnable module, the Invariant Transformer Net, is introduced, which learns differentiable parameters for a set of affine transformations; this makes it possible to extract the space of transformations to which the CNN is invariant and under which its class prediction remains robust.

Volumetric Transformer Networks

This work proposes a loss function defined between the warped features of pairs of instances, which improves the localization ability of the volumetric transformer network (VTN) and consistently boosts the features' representation power, and consequently the network's accuracy, on fine-grained image recognition and instance-level image retrieval.
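The paper's exact loss is not reproduced here; a plausible minimal sketch is a simple distance between the warped features of two instances, where the sampling grids are assumed to come from the transformer:

```python
import torch.nn.functional as F

def warped_feature_consistency(feat_a, feat_b, grid_a, grid_b):
    """Hypothetical sketch: penalise disagreement between the warped features
    of a pair of instances (e.g. of the same class). grid_a/grid_b are the
    per-instance sampling grids produced by the transformer (assumption)."""
    warped_a = F.grid_sample(feat_a, grid_a, align_corners=False)
    warped_b = F.grid_sample(feat_b, grid_b, align_corners=False)
    return F.mse_loss(warped_a, warped_b)
```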

DeSTNet: Densely Fused Spatial Transformer Networks

This paper proposes the Densely Fused Spatial Transformer Network (DeSTNet), which, to the best of the authors' knowledge, is the first dense fusion pattern for combining multiple STNs, and shows how changing the connectivity pattern of multiple STNs from sequential to dense leads to more powerful alignment modules.
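As a rough illustration of "dense" versus "sequential" connectivity, here is a hypothetical sketch in which every stage's affine update is fused into a running estimate rather than applied in isolation; the actual DeSTNet fusion rule may differ, so treat all details below as assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenselyFusedSTN(nn.Module):
    """Illustrative sketch only: K localisation nets whose affine updates are
    all fused into one running estimate (here, by summation) instead of being
    applied one after another. The real DeSTNet fusion rule may differ."""
    def __init__(self, loc_nets: nn.ModuleList):
        super().__init__()
        self.loc_nets = loc_nets  # each maps an image to 6 affine parameters

    def forward(self, x):
        identity = torch.tensor([1., 0., 0., 0., 1., 0.], device=x.device)
        fused = identity.repeat(x.size(0), 1)        # running parameter estimate
        for loc in self.loc_nets:
            grid = F.affine_grid(fused.view(-1, 2, 3), x.size(), align_corners=False)
            warped = F.grid_sample(x, grid, align_corners=False)
            fused = fused + loc(warped)              # fuse every stage's update
        grid = F.affine_grid(fused.view(-1, 2, 3), x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```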

Exploiting Cyclic Symmetry in Convolutional Neural Networks

This work introduces four operations which can be inserted into neural network models as layers, and which can be combined to make these models partially equivariant to rotations and enable parameter sharing across different orientations.
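Two of those operations, slicing (stacking rotated copies along the batch axis) and orientation pooling, are easy to sketch; rot90-based rotations and max-pooling are one natural reading, not necessarily the paper's exact formulation:

```python
import torch

def cyclic_slice(x):
    """Stack the four 90-degree rotations of a batch along the batch axis,
    so one set of filters sees all orientations (parameter sharing)."""
    return torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)], dim=0)

def cyclic_pool(x, batch_size):
    """Pool (here: max) over the four orientation copies produced above,
    giving features invariant to 90-degree rotations."""
    return torch.stack(x.split(batch_size, dim=0), dim=0).max(dim=0).values

# Usage sketch: convolutional layers go between slice and pool, e.g.
# feats = conv(cyclic_slice(images)); out = cyclic_pool(feats, images.size(0))
```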

Warped Convolutions: Efficient Invariance to Spatial Transformations

This work presents a construction that is simple and exact, yet has the same computational complexity that standard convolutions enjoy, consisting of a constant image warp followed by a simple convolution, which are standard blocks in deep learning toolboxes.
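A hedged sketch of the construction, using a log-polar warp (one of the warps under which rotation and scaling become translations) followed by an ordinary convolution; the grid ranges and the 3x3 filter assumption are illustrative:

```python
import torch
import torch.nn.functional as F

def make_logpolar_grid(h, w):
    """Fixed log-polar sampling grid: rotation/scale about the image centre
    become translations in the warped image (one illustrative choice of warp)."""
    r = torch.linspace(-4.0, 0.0, h).exp()          # log-spaced radii in (0, 1]
    t = torch.linspace(-torch.pi, torch.pi, w)      # angles
    rr, tt = torch.meshgrid(r, t, indexing="ij")
    return torch.stack([rr * tt.cos(), rr * tt.sin()], dim=-1)  # (h, w, 2)

def warped_conv(x, weight):
    """Constant warp followed by a plain convolution (assumes 3x3 filters)."""
    grid = make_logpolar_grid(x.size(2), x.size(3)).to(x)
    grid = grid.unsqueeze(0).expand(x.size(0), -1, -1, -1)
    return F.conv2d(F.grid_sample(x, grid, align_corners=False), weight, padding=1)

# x = torch.randn(2, 3, 32, 32); w = torch.randn(8, 3, 3, 3); y = warped_conv(x, w)
```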
...

References

Showing 1–10 of 43 references

Transforming Auto-Encoders

It is argued that neural networks can be used to learn features that output a whole vector of instantiation parameters, and that this is a much more promising way of dealing with variations in position, orientation, scale and lighting than the methods currently employed in the neural networks community.
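A minimal sketch of one such capsule for the translation-only case, with illustrative sizes; recognition units infer a position and a presence probability, and generation units redraw the entity at the externally shifted position:

```python
import torch
import torch.nn as nn

class Capsule(nn.Module):
    """Sketch of one transforming-autoencoder capsule (translation-only case).
    Sizes and layer choices are illustrative assumptions."""
    def __init__(self, n_pixels, n_hidden=32):
        super().__init__()
        self.recognise = nn.Sequential(nn.Linear(n_pixels, n_hidden), nn.ReLU())
        self.xy = nn.Linear(n_hidden, 2)        # inferred entity position
        self.presence = nn.Linear(n_hidden, 1)  # probability the entity is present
        self.generate = nn.Sequential(nn.Linear(2, n_hidden), nn.ReLU(),
                                      nn.Linear(n_hidden, n_pixels))

    def forward(self, image_flat, delta_xy):
        h = self.recognise(image_flat)
        xy = self.xy(h) + delta_xy              # apply the externally given shift
        p = torch.sigmoid(self.presence(h))
        return p * self.generate(xy)            # gated reconstruction

# Training target: the input image shifted by delta_xy; summing many capsules'
# outputs gives the full predicted image.
```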

Deep Symmetry Networks

Deep symmetry networks (symnets) are introduced: a generalization of convnets that forms feature maps over arbitrary symmetry groups and uses kernel-based interpolation to tractably tie parameters and pool over symmetry spaces of any dimension.

Locally Scale-Invariant Convolutional Neural Networks

A simple model is presented that allows ConvNets to learn features in a locally scale-invariant manner without increasing the number of model parameters, and it is shown on a modified MNIST dataset that, when faced with scale variation, building in scale invariance allows ConvNets to learn more discriminative features with reduced chances of over-fitting.
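One natural reading of this, sketched below under the assumption that the same filters are applied to rescaled copies of the input and the responses are max-pooled across scales (the scale factors are illustrative):

```python
import torch
import torch.nn.functional as F

def scale_invariant_conv(x, weight, scales=(0.75, 1.0, 1.25)):
    """Apply the *same* filters at several scales and max-pool the responses
    across scales, so no extra parameters are added."""
    h, w = x.shape[2:]
    responses = []
    for s in scales:
        xs = F.interpolate(x, scale_factor=s, mode="bilinear", align_corners=False)
        ys = F.conv2d(xs, weight, padding=weight.size(-1) // 2)
        # Resize the response back so all scales align spatially.
        responses.append(F.interpolate(ys, size=(h, w), mode="bilinear",
                                       align_corners=False))
    return torch.stack(responses, dim=0).max(dim=0).values
```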

Neural Activation Constellations: Unsupervised Part Model Discovery with Convolutional Networks

An approach is presented that learns part models in a completely unsupervised manner, without part annotations and even without bounding boxes during learning, by finding constellations of neural activation patterns computed using convolutional neural networks.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
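The parameter argument for small filters can be made concrete with a little arithmetic: a stack of three 3x3 layers covers a 7x7 receptive field with fewer parameters and more nonlinearities than a single 7x7 layer.

```python
# Why stacks of 3x3 filters: two 3x3 layers see a 5x5 region, three see 7x7,
# with fewer parameters than one large filter. C is illustrative.
C = 64                                   # channels in and out
params_3x3_stack = 3 * (3 * 3 * C * C)   # three 3x3 layers: 27*C^2
params_7x7_single = 7 * 7 * C * C        # one 7x7 layer:    49*C^2
print(params_3x3_stack, params_7x7_single)  # 110592 200704
```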

Deep Networks with Internal Selective Attention through Feedback Connections

DasNet harnesses the power of sequential processing to improve classification performance, by allowing the network to iteratively focus its internal attention on some of its convolutional filters.

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
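The Inception module itself is a set of parallel 1x1, 3x3, 5x5 and pooling branches concatenated along channels; a sketch with illustrative branch widths (not GoogLeNet's exact configuration):

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Sketch of one Inception module: parallel 1x1, 3x3, 5x5 and pooling
    branches, with 1x1 reductions, concatenated along the channel axis."""
    def __init__(self, c_in, c_branch=16):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, c_branch, 1)
        self.b3 = nn.Sequential(nn.Conv2d(c_in, c_branch, 1), nn.ReLU(),
                                nn.Conv2d(c_branch, c_branch, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(c_in, c_branch, 1), nn.ReLU(),
                                nn.Conv2d(c_branch, c_branch, 5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(c_in, c_branch, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)
```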

Fully convolutional networks for semantic segmentation

The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
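A minimal sketch of that insight: replace fully connected classifier layers with 1x1 convolutions and upsample the coarse score map to the input size. The paper learns its upsampling filters; plain bilinear interpolation is used here for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FCNHead(nn.Module):
    """Fully convolutional classification head: per-location class scores via
    a 1x1 conv, then upsampling to a correspondingly-sized dense output.
    Backbone and sizes are illustrative assumptions."""
    def __init__(self, c_feat, n_classes):
        super().__init__()
        self.score = nn.Conv2d(c_feat, n_classes, kernel_size=1)

    def forward(self, feat, out_size):
        coarse = self.score(feat)                    # coarse score map
        return F.interpolate(coarse, size=out_size,  # dense, input-sized output
                             mode="bilinear", align_corners=False)

# scores = FCNHead(512, 21)(torch.randn(1, 512, 12, 16), out_size=(384, 512))
```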

Efficient and accurate approximations of nonlinear convolutional networks

This paper aims to accelerate the test-time computation of deep convolutional neural networks (CNNs), and takes the nonlinear units into account, subject to a low-rank constraint which helps to reduce the complexity of filters.
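The paper's method additionally accounts for the nonlinearity; the plain linear version of the idea is an SVD of the filter matrix, splitting one conv layer into a narrower conv followed by a 1x1 conv, sketched below:

```python
import torch

def low_rank_factorise(weight, rank):
    """Illustrative low-rank factorisation of a conv layer: SVD the (N, C*k*k)
    filter matrix and split it into a k x k conv with `rank` filters followed
    by a 1x1 conv. This is the plain linear version, not the paper's
    nonlinearity-aware method."""
    n, c, kh, kw = weight.shape
    u, s, vh = torch.linalg.svd(weight.reshape(n, -1), full_matrices=False)
    first = (s[:rank].sqrt().unsqueeze(1) * vh[:rank]).reshape(rank, c, kh, kw)
    second = (u[:, :rank] * s[:rank].sqrt()).reshape(n, rank, 1, 1)
    return first, second  # apply as: conv2d(conv2d(x, first, padding=...), second)
```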

Bilinear CNN Models for Fine-Grained Visual Recognition

We propose bilinear models, a recognition architecture that consists of two feature extractors whose outputs are multiplied using an outer product at each location of the image and pooled to obtain an image descriptor.
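A sketch of the pooling step, with the signed square root and L2 normalisation that bilinear CNN descriptors conventionally use:

```python
import torch
import torch.nn.functional as F

def bilinear_pool(feat_a, feat_b):
    """Outer product of two feature maps at every location, averaged over
    locations, then signed-sqrt and L2 normalised."""
    n, ca, h, w = feat_a.shape
    cb = feat_b.size(1)
    a = feat_a.reshape(n, ca, h * w)
    b = feat_b.reshape(n, cb, h * w)
    desc = torch.bmm(a, b.transpose(1, 2)) / (h * w)   # (N, Ca, Cb)
    desc = desc.flatten(1)
    desc = desc.sign() * desc.abs().sqrt()             # signed square root
    return F.normalize(desc, dim=1)                    # L2 normalisation
```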