Corpus ID: 203951421

Unaligned Image-to-Sequence Transformation with Loop Consistency

@article{Wang2019UnalignedIT,
  title={Unaligned Image-to-Sequence Transformation with Loop Consistency},
  author={Siyang Wang and Justin Lazarow and Kwonjoon Lee and Zhuowen Tu},
  journal={ArXiv},
  year={2019},
  volume={abs/1910.04149}
}
We tackle the problem of modeling sequential visual phenomena. Given examples of a phenomenon that can be divided into discrete time steps, we aim to take an input from any such time step and realize this input at all other time steps in the sequence. Furthermore, we aim to do this without ground-truth aligned sequences, avoiding the difficulties of gathering aligned data. This generalizes the unpaired image-to-image problem from generating pairs to generating sequences. We extend cycle consistency to loop consistency.
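For concreteness, here is one plausible shape of such a loop-consistency objective, sketched in our own notation rather than the paper's: assume a generator G_{i→i+1} between each pair of consecutive time steps (indices taken cyclically over T steps), and require that translating an image around the entire loop returns it to itself:

\mathcal{L}_{\text{loop}} = \mathbb{E}_{x_1}\Big[ \big\| (G_{T\to 1} \circ G_{T-1\to T} \circ \cdots \circ G_{1\to 2})(x_1) - x_1 \big\|_1 \Big]

This is the sequence-level analogue of CycleGAN's two-domain cycle constraint, which is recovered when T = 2.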


References

Showing 1–10 of 30 references
ComboGAN: Unrestrained Scalability for Image Domain Translation
TLDR: Proposes a multi-component image translation model and training scheme that scales linearly, in both resource consumption and training time, with the number of domains, and demonstrates its capabilities on a dataset of paintings by 14 different artists.
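As a rough illustration of where the linear scaling comes from, here is a minimal PyTorch sketch of a ComboGAN-style decoupling (class names, layer sizes, and the translate helper are ours, for illustration only): each domain gets one encoder into a shared latent space and one decoder out of it, so n domains need 2n networks rather than n(n-1) dedicated translators.

import torch.nn as nn

class Encoder(nn.Module):
    """Maps an image from one domain into the shared latent space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 64, 7, padding=3), nn.ReLU())
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Maps a shared latent code back into a specific domain."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(64, 3, 7, padding=3), nn.Tanh())
    def forward(self, z):
        return self.net(z)

n_domains = 14  # e.g., the 14 painters mentioned above
encoders = nn.ModuleList(Encoder() for _ in range(n_domains))
decoders = nn.ModuleList(Decoder() for _ in range(n_domains))

def translate(x, src, dst):
    # Any directed pair (src -> dst) is a composition of two shared parts.
    return decoders[dst](encoders[src](x))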
Image-to-Image Translation with Conditional Adversarial Networks
TLDR: Investigates conditional adversarial networks as a general-purpose solution to image-to-image translation problems and demonstrates that the approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
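For reference, the conditional GAN objective used there, in its standard form (z is the generator's noise input, and λ weights an L1 reconstruction term against the adversarial term):

\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x,z}[\log(1 - D(x, G(x, z)))]

G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda\, \mathbb{E}_{x,y,z}\big[\|y - G(x, z)\|_1\big]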
Unsupervised Image-to-Image Translation Networks
TLDR: Makes a shared-latent-space assumption and proposes an unsupervised image-to-image translation framework based on Coupled GANs that achieves state-of-the-art performance on benchmark datasets.
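The shared-latent-space assumption can be written compactly: corresponding images from the two domains map to a common code z, and translation decodes through that shared space:

z = E_1(x_1) = E_2(x_2), \qquad x_{1\to 2} = G_2(E_1(x_1)), \qquad x_{2\to 1} = G_1(E_2(x_2))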
Perceptual Losses for Real-Time Style Transfer and Super-Resolution
TLDR: Proposes perceptual loss functions for training feed-forward networks on image transformation tasks, and shows results on image style transfer, where a feed-forward network is trained to solve, in real time, the optimization problem posed by Gatys et al.
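The central quantity is the feature reconstruction loss, which compares activations φ_j at layer j of a fixed pretrained network (VGG in the paper) instead of raw pixels:

\ell_{feat}^{\phi, j}(\hat{y}, y) = \frac{1}{C_j H_j W_j} \big\| \phi_j(\hat{y}) - \phi_j(y) \big\|_2^2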
Generating Videos with Scene Dynamics
TLDR: Proposes a generative adversarial network for video with a spatio-temporal convolutional architecture that untangles the scene's foreground from its background, and can generate tiny videos, up to a second long at full frame rate, that outperform simple baselines.
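The foreground/background untangling takes the form of a masked two-stream composition, as commonly written for this model: a spatio-temporal foreground stream f(z), a static background stream b(z) replicated across time, and a mask m(z) that blends them per pixel:

G(z) = m(z) \odot f(z) + (1 - m(z)) \odot b(z)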
Multimodal Unsupervised Image-to-Image Translation
TLDR: Proposes a Multimodal Unsupervised Image-to-Image Translation (MUNIT) framework that assumes the image representation can be decomposed into a content code that is domain-invariant and a style code that captures domain-specific properties.
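In symbols: each image splits into a domain-invariant content code and a domain-specific style code, and translation recombines content from the source with a style from the target domain (sampling s_2 from a prior yields the multimodal outputs):

(c_1, s_1) = (E_1^c(x_1), E_1^s(x_1)), \qquad x_{1\to 2} = G_2(c_1, s_2), \quad s_2 \sim \mathcal{N}(0, I)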
StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation
TLDR: The unified StarGAN architecture allows simultaneous training on multiple datasets with different domains within a single network, leading to superior quality of translated images compared to existing models, as well as the novel capability of flexibly translating an input image to any desired target domain.
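The single-network trick is to condition one generator on a target-domain label c and enforce a cycle-style reconstruction back to the original label c':

x_{fake} = G(x, c), \qquad \mathcal{L}_{rec} = \mathbb{E}_{x, c, c'}\big[\|x - G(G(x, c), c')\|_1\big]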
Generating the Future with Adversarial Transformers
TLDR: Presents a model that generates the future by transforming pixels in the past and explicitly disentangles the model's memory from the prediction, which helps the model learn desirable invariances.
Non-local Neural Networks
TLDR: Presents non-local operations as a generic family of building blocks for capturing long-range dependencies in computer vision, improving object detection/segmentation and pose estimation on the COCO suite of tasks.
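The generic non-local operation computes each output position i as a normalized weighted sum over all positions j, where f is a pairwise affinity, g a learned embedding, and C(x) a normalization factor:

y_i = \frac{1}{\mathcal{C}(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j)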
Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization
TLDR: Presents a simple yet effective approach that, for the first time, enables arbitrary style transfer in real time, at speeds comparable to the fastest existing approaches, without the restriction to a pre-defined set of styles.
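The key operation, adaptive instance normalization (AdaIN), aligns the channel-wise mean and standard deviation of the content features x to those of the style features y:

\mathrm{AdaIN}(x, y) = \sigma(y) \left( \frac{x - \mu(x)}{\sigma(x)} \right) + \mu(y)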