Deep Visual Analogy-Making
@inproceedings{Reed2015DeepVA, title={Deep Visual Analogy-Making}, author={Scott E. Reed and Yi Zhang and Yuting Zhang and Honglak Lee}, booktitle={NIPS}, year={2015} }
In addition to identifying the content within a single image, relating images and generating related images are critical tasks for image understanding. Recently, deep convolutional networks have yielded breakthroughs in predicting image labels, annotations and captions, but have only just begun to be used for generating high-quality images. In this paper we develop a novel deep network trained end-to-end to perform visual analogy making, which is the task of transforming a query image according…
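The abstract frames the task as analogy completion a : b :: c : d, where a query image c is transformed in the same way that a reference pair (a, b) differ. Below is a minimal sketch of one way to realize this in feature space, assuming an encoder/decoder pair and a purely additive transformation f(b) - f(a) + f(c); the fully-connected layers and dimensions are illustrative placeholders, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class AdditiveAnalogyNet(nn.Module):
    """Sketch of analogy completion a : b :: c : d in feature space.
    Embed the three input images, apply the increment f(b) - f(a) to
    f(c) by vector addition, and decode the result. Layer sizes and the
    fully-connected encoder/decoder are illustrative assumptions."""

    def __init__(self, image_dim=48 * 48 * 3, embed_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(image_dim, 1024), nn.ReLU(),
            nn.Linear(1024, embed_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, 1024), nn.ReLU(),
            nn.Linear(1024, image_dim), nn.Sigmoid(),
        )

    def forward(self, a, b, c):
        fa, fb, fc = self.encoder(a), self.encoder(b), self.encoder(c)
        # a : b :: c : d  ->  apply the (b - a) increment to c in feature space
        return self.decoder(fb - fa + fc)

# Training would minimize a pixel reconstruction loss against the target
# image d, e.g. loss = ((model(a, b, c) - d) ** 2).sum()
```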
236 Citations
Visual attribute transfer through deep image analogy
- Computer Science, ACM Trans. Graph.
- 2017
The technique finds semantically-meaningful dense correspondences between two input images by adapting the notion of "image analogy" with features extracted from a Deep Convolutional Neural Network for matching, and is called deep image analogy.
Semantic Image Analogy with a Conditional Single-Image GAN
- Computer Science, ACM Multimedia
- 2020
This work proposes a novel method to model the patch-level correspondence between semantic layout and appearance of a single image by training a single-image GAN that takes semantic labels as conditional input.
Learning to detect visual relations
- Computer Science
- 2019
A weakly-supervised approach is proposed which, given pre-trained object detectors, enables learning relation detectors from image-level labels only, maintaining performance close to fully-supervised models.
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
- Computer Science, NIPS
- 2016
A novel approach that models future frames in a probabilistic manner is proposed, namely a Cross Convolutional Network to aid in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively.
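The cross-convolution idea in the summary above (content as feature maps, motion as convolutional kernels) can be sketched as a channel-wise convolution in which each feature map is filtered by its own predicted kernel. This is a hedged illustration under that assumption; `cross_convolve` and the tensor shapes are placeholders, not the paper's exact layer.

```python
import torch.nn.functional as F

def cross_convolve(feature_maps, kernels):
    """Illustrative cross-convolution: each (sample, channel) feature map
    is convolved with its own predicted kernel. Shapes assumed:
    feature_maps (B, C, H, W), kernels (B, C, k, k) with odd k."""
    B, C, H, W = feature_maps.shape
    k = kernels.shape[-1]
    # Group trick: treat every (sample, channel) pair as its own group so
    # a single grouped conv applies a distinct kernel to each feature map.
    x = feature_maps.reshape(1, B * C, H, W)
    w = kernels.reshape(B * C, 1, k, k)
    out = F.conv2d(x, w, padding=k // 2, groups=B * C)
    return out.reshape(B, C, H, W)
```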
From A to Z: Supervised Transfer of Style and Content Using Deep Neural Network Generators
- Computer Science, ArXiv
- 2016
This network is a modified variational autoencoder that supports supervised training of single-image analogies and in-network evaluation of outputs with a structured similarity objective that captures pixel covariances.
Detecting Unseen Visual Relations Using Analogies
- Computer Science, 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work learns a representation of visual relations that combines individual embeddings for subject, object and predicate together with a visual phrase embedding that represents the relation triplet, and demonstrates the benefits of this approach on three challenging datasets.
Few-shot Visual Reasoning with Meta-analogical Contrastive Learning
- Computer Science, NeurIPS
- 2020
This work meta-learns its analogical contrastive learning model over the same tasks with diverse attributes, and shows that it generalizes to the same visual reasoning problem with unseen attributes.
Leveraging structure in Computer Vision tasks for flexible Deep Learning models
- Computer Science
- 2020
This thesis argues that, in contrast to the usual black-box behavior of neural networks, leveraging more structured internal representations is a promising direction for tackling problems, and focuses on two forms of structure, compositional architectures and modularity.
Representation Learning by Learning to Count
- Computer Science, 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017
This paper uses two image transformations in the context of counting, scaling and tiling, to train a neural network with a contrastive loss; the resulting representations perform on par with or exceed the state of the art on transfer learning benchmarks.
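The scaling-and-tiling constraint can be made concrete: the feature "counts" of a downsampled image should equal the sum of the counts of its four tiles, while a second, unrelated image should violate that equality by a margin. The sketch below is an assumption-laden illustration of such a loss (the name `counting_loss`, the feature extractor `phi`, and the margin value are placeholders); the paper's exact objective may differ.

```python
import torch.nn.functional as F

def counting_loss(phi, x, y, margin=10.0):
    """Illustrative counting constraint. phi maps a (B, C, h, w) batch to
    (B, D) feature "counts". x is the training image batch, y an unrelated
    batch used for the contrastive term."""
    B, C, H, W = x.shape
    h, w = H // 2, W // 2
    # Four non-overlapping tiles of x; their counts should add up.
    tiles = [x[:, :, :h, :w], x[:, :, :h, w:], x[:, :, h:, :w], x[:, :, h:, w:]]
    tile_sum = sum(phi(t) for t in tiles)
    # Downsample both images to tile resolution.
    down_x = F.interpolate(x, size=(h, w), mode='bilinear', align_corners=False)
    down_y = F.interpolate(y, size=(h, w), mode='bilinear', align_corners=False)
    pos = ((phi(down_x) - tile_sum) ** 2).sum(dim=1)   # should be small
    neg = ((phi(down_y) - tile_sum) ** 2).sum(dim=1)   # should exceed the margin
    return (pos + F.relu(margin - neg)).mean()
```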
Image Analogy with Gaussian Process
- Computer Science, 2018 IEEE International Conference on Big Data and Smart Computing (BigComp)
- 2018
This work proposes an image analogy method using a Gaussian process that performs significantly better than deep neural networks when the dataset is small, and proposes novel sampling methods that select salient instances from a given dataset.
References
Showing 1-10 of 33 references
Analogy-preserving Semantic Embedding for Visual Object Categorization
- Computer Science, ICML
- 2013
Analogy-preserving Semantic Embedding (ASE) is proposed to model analogies that reflect the relationships between multiple pairs of classes simultaneously, in the form "p is to q, as r is to s".
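One simple way to encode "p is to q as r is to s" in an embedding space is to penalize the mismatch between the two difference vectors e_p - e_q and e_r - e_s. The sketch below illustrates that idea; the actual ASE regularizer may be formulated differently.

```python
def analogy_regularizer(embed, analogies):
    """Illustrative analogy-preserving penalty. `embed` is an
    (n_classes, dim) array of class embeddings (NumPy or similar);
    `analogies` is a list of index quadruples (p, q, r, s)."""
    loss = 0.0
    for p, q, r, s in analogies:
        # "p is to q as r is to s": the two difference vectors should match.
        diff = (embed[p] - embed[q]) - (embed[r] - embed[s])
        loss = loss + (diff ** 2).sum()
    return loss
```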
Deep Convolutional Inverse Graphics Network
- Computer Science, NIPS
- 2015
This paper presents the Deep Convolutional Inverse Graphics Network (DC-IGN), a model that aims to learn an interpretable representation of images, disentangled with respect to three-dimensional scene…
Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines
- Computer Science, Neural Computation
- 2010
A low-rank approximation to the three-way interaction tensor is proposed, expressed as a sum of factors, each of which is a three-way outer product; this allows efficient learning of transformations between larger image patches, demonstrated by learning optimal filter pairs from various synthetic and real image sequences.
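The factorization in the summary can be written compactly: the full interaction tensor W[i, j, k] is approximated by sum_f Wx[i, f] * Wy[j, f] * Wh[k, f], so each hidden (mapping) unit sees an elementwise product of filter responses to the two images. A minimal NumPy sketch, with the shapes and the logistic nonlinearity as assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mapping_units(x, y, Wx, Wy, Wh, bh):
    """Factored three-way interaction:
    W[i, j, k] ~ sum_f Wx[i, f] * Wy[j, f] * Wh[k, f].
    Shapes assumed: x (Dx,), y (Dy,), Wx (Dx, F), Wy (Dy, F),
    Wh (Dh, F), bh (Dh,)."""
    factor_x = Wx.T @ x            # filter responses to the first image
    factor_y = Wy.T @ y            # filter responses to the second image
    return sigmoid(Wh @ (factor_x * factor_y) + bh)
```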
Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis
- Computer Science, NIPS
- 2015
A novel recurrent convolutional encoder-decoder network is trained end-to-end on the task of rendering rotated objects starting from a single image; the recurrent structure allows the model to capture long-term dependencies along a sequence of transformations.
Image analogies
- Art, SIGGRAPH
- 2001
This paper describes a new framework for processing images by example, called “image analogies,” based on a simple multi-scale autoregression, inspired primarily by recent results in texture synthesis.
"Mental Rotation" by Optimizing Transforming Distance
- Computer Science, ArXiv
- 2014
A trained relational model actively transforms pairs of examples so that they are maximally similar in some feature space yet respect the learned transformational constraints, in order to facilitate a search over a learned space of transformations.
Caffe: Convolutional Architecture for Fast Feature Embedding
- Computer Science, ACM Multimedia
- 2014
Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
Modeling the joint density of two images under a variety of transformations
- Computer Science, CVPR 2011
- 2011
The model is defined as a factored three-way Boltzmann machine, in which hidden variables collaborate to define the joint correlation matrix for image pairs; this makes it possible to efficiently match images that are the same according to a learned measure of similarity.
Transformation Properties of Learned Visual Representations
- Mathematics, ICLR
- 2015
A model of rotating NORB objects is used to demonstrate a latent representation of the non-commutative 3D rotation group SO(3) that is equivalent to a combination of its elementary irreducible representations.
Learning to generate chairs with convolutional neural networks
- Computer Science, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
This work trains a generative convolutional neural network that generates images of objects given object type, viewpoint, and color, and shows that the network can be used to find correspondences between different chairs in the dataset, outperforming existing approaches on this task.