Corpus ID: 6470548

What Happened to My Dog in That Network: Unraveling Top-down Generators in Convolutional Neural Networks

Patrick W. Gallagher, Shuai Tang, Zhuowen Tu
Top-down information plays a central role in human perception, but plays relatively little role in many current state-of-the-art deep networks, such as Convolutional Neural Networks (CNNs). This work seeks to explore a path by which top-down information can have a direct impact within current deep networks. We explore this path by learning and using "generators" corresponding to the network internal effects of three types of transformation (each a restriction of a general affine transformation… 
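The abstract describes learning "generators" that capture the network-internal effect of an input transformation on hidden features. A minimal illustrative sketch of that core idea follows; it is not the paper's method, and the dimensions, random stand-in features, and the simple least-squares fit are all assumptions for illustration.

```python
import numpy as np

# Sketch: learn a linear "generator" G so that G @ f(x) ~= f(T(x)),
# i.e. G approximates what an input transformation T does to the
# features f(x) of a network layer. All quantities here are stand-ins:
# in the paper the features would come from a CNN layer.
rng = np.random.default_rng(0)

d = 16   # feature dimensionality (assumed)
n = 200  # number of (original, transformed) feature pairs (assumed)

F_orig = rng.normal(size=(d, n))                # features of original inputs
G_true = rng.normal(size=(d, d)) / np.sqrt(d)   # unknown internal effect of T
F_trans = G_true @ F_orig                       # features of transformed inputs

# Least-squares estimate of the generator from paired features:
G_hat = F_trans @ np.linalg.pinv(F_orig)

print(np.allclose(G_hat, G_true, atol=1e-6))
```

With more pairs than feature dimensions and noiseless features, the pseudoinverse fit recovers the generator exactly; with real CNN features one would instead minimize a regression loss.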


Controllable Top-down Feature Transformer

This work develops top-down feature transformer (TFT) that is able to account for the hidden layer transformation while maintaining the overall consistency across layers and shows that it can be adopted in other applications such as data augmentation and image style transfer.

Top-down Flow Transformer Networks

A comprehensive study on various datasets including MNIST, shapes, and natural images with both inner and inter datasets demonstrates the advantages of the proposed top-down flow transformer framework, which can be adopted in a variety of computer vision applications.

Growing Interpretable Part Graphs on ConvNets via Multi-Shot Learning

This paper proposes a learning strategy that embeds object-part concepts into a pre-trained convolutional neural network (CNN), in an attempt to 1) explore explicit semantics hidden in CNN units… 

Mining Interpretable AOG Representations From Convolutional Networks via Active Question Answering

A question-answering method that uses active human-computer communications to mine patterns from a pre-trained CNN, in order to incrementally explain more features in conv-layers, and achieves similar or better part-localization performance than fast-RCNN methods.

Multi-Shot Mining Semantic Part Concepts in CNNs

This paper proposes a new learning strategy that incrementally embeds new object-part concepts into a pre-trained convolutional neural network (CNN), in order to 1) explore explicit semantics for the… 

References



Deep neural networks are easily fooled: High confidence predictions for unrecognizable images

This work takes convolutional neural networks trained to perform well on the ImageNet or MNIST datasets and uses evolutionary algorithms or gradient ascent to find images that the DNNs label with high confidence as belonging to particular classes despite being unrecognizable to humans; these fooling images raise questions about the generality of DNN computer vision.

Intriguing properties of neural networks

It is found that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis, suggesting that it is the space, rather than the individual units, that contains the semantic information in the higher layers of neural networks.

Generative Modeling of Convolutional Neural Networks

Experiments on the challenging ImageNet benchmark show that the proposed generative gradient pre-training consistently improves the performance of CNNs, and the proposed generative visualization method produces meaningful and varied samples of synthetic images from a large-scale deep CNN.

Understanding deep image representations by inverting them

Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of… 

Inverting Visual Representations with Convolutional Networks

  • A. Dosovitskiy, T. Brox
  • Computer Science
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
This work proposes a new approach to study image representations by inverting them with an up-convolutional neural network, and applies this method to shallow representations (HOG, SIFT, LBP), as well as to deep networks.

Spatial Transformer Networks

This work introduces a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network, and can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps.
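The spatial-transformer idea above can be sketched as an affine matrix that defines a sampling grid over a feature map, which is then resampled. The function below is a simplified stand-in (nearest-neighbor sampling, a fixed rather than learned matrix); the actual module uses differentiable bilinear sampling so that the transform parameters can be learned end-to-end.

```python
import numpy as np

def affine_transform(fmap, theta):
    """Resample a 2-D feature map under the 2x3 affine map `theta`.

    Coordinates are normalized to [-1, 1], as in the spatial-transformer
    formulation; sampling here is nearest-neighbor for brevity.
    """
    h, w = fmap.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w),
                         indexing="ij")
    grid = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (h, w, 3)
    src = grid @ theta.T                                  # (h, w, 2) source coords
    # Map normalized coordinates back to pixel indices, clamped to the map.
    sx = np.clip(((src[..., 0] + 1) / 2 * (w - 1)).round().astype(int), 0, w - 1)
    sy = np.clip(((src[..., 1] + 1) / 2 * (h - 1)).round().astype(int), 0, h - 1)
    return fmap[sy, sx]

# The identity matrix leaves the feature map unchanged.
identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
fmap = np.arange(16.0).reshape(4, 4)
print(np.array_equal(affine_transform(fmap, identity), fmap))
```

Replacing `identity` with a rotation, scaling, or translation matrix warps the feature map accordingly, which is the operation the Spatial Transformer module makes differentiable and learnable.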

Inverting Convolutional Networks with Convolutional Networks

This work proposes a new approach to study deep image representations by inverting them with an up-convolutional neural network, and application of this method to a deep network trained on ImageNet provides numerous insights into the properties of the feature representation.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes, employing a recently developed regularization method called "dropout" that proved to be very effective.

Sequence to Sequence Learning with Neural Networks

This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence that made the optimization problem easier.

Unsupervised Learning of Image Transformations

A probabilistic model for learning rich, distributed representations of image transformations that develops domain-specific motion features, in the form of fields of locally transformed edge filters, and can fantasize new transformations on previously unseen images.