Corpus ID: 14448399

Synthesizing Dynamic Textures and Sounds by Spatial-Temporal Generative ConvNet

Jianwen Xie, Song-Chun Zhu, Ying Nian Wu
Dynamic textures are spatial-temporal processes that exhibit statistical stationarity or stochastic repetitiveness in the temporal dimension. In this paper, we study the problem of modeling and synthesizing dynamic textures using a generative version of the convolutional neural network (ConvNet or CNN) that consists of multiple layers of spatial-temporal filters to capture the spatial-temporal patterns in the dynamic textures. We show that such a spatial-temporal generative ConvNet can synthesize…
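As a minimal illustration of the spatial-temporal filtering the abstract refers to (a sketch only, not the authors' implementation; the function and the toy video below are hypothetical), the response of one 3D filter over a video volume can be written in NumPy:

```python
import numpy as np

def st_filter_response(video, kernel):
    """Valid-mode spatial-temporal (3D) filter response.

    video:  array of shape (T, H, W), a grayscale image sequence
    kernel: array of shape (t, h, w), one spatial-temporal filter
    """
    T, H, W = video.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                # inner product of the filter with one video patch
                out[i, j, k] = np.sum(video[i:i+t, j:j+h, k:k+w] * kernel)
    return out

# toy example: 8 frames of 16x16 noise, one 3x3x3 averaging filter
video = np.random.default_rng(0).standard_normal((8, 16, 16))
kernel = np.ones((3, 3, 3)) / 27.0
resp = st_filter_response(video, kernel)
print(resp.shape)  # (6, 14, 14)
```

A generative ConvNet of the kind described stacks many such learned filters into multiple layers; this sketch shows only the basic filter-response operation a single unit computes.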
Synthesising Dynamic Textures using Convolutional Neural Networks
The model is based on spatiotemporal summary statistics computed from the feature representations of a Convolutional Neural Network trained on object recognition and can be used to synthesise new samples of dynamic textures and to predict motion in simple movies.
Two-Stream Convolutional Networks for Dynamic Texture Synthesis
A two-stream model for dynamic texture synthesis based on pre-trained convolutional networks that target two independent tasks, object recognition and optical flow prediction; the model generates novel, high-quality samples that match both the frame-wise appearance and the temporal evolution of the input texture.
Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks
The probabilistic model makes it possible to sample and synthesize many possible future frames from a single input image and to synthesize realistic movement of objects; to this end, a novel network structure, the Cross Convolutional Network, is proposed.
Motion-based analysis and synthesis of dynamic textures
The algorithm first employs a perspective motion model to compensate for the global camera motion, so that only the remaining dynamic motion is further analysed; it uses dense optical flow instead of a block-based approach, which is important for representing the true motion of rigid objects and modeling the random motion of dynamic textures.
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
A novel approach that models future frames in a probabilistic manner is proposed, namely a Cross Convolutional Network that aids in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively.
DynamoNet: Dynamic Action and Motion Network
A novel unified spatio-temporal 3D-CNN architecture (DynamoNet) is introduced that jointly optimizes video classification and motion-representation learning by predicting future frames as a multi-task learning problem.
Video Imagination from a Single Image with Transformation Generation
A new framework that produces imaginary videos by transformation generation is proposed; it is trained in an adversarial way with unsupervised learning and achieves promising performance in image quality assessment.
Exploiting visual motion to understand our visual world
This dissertation discusses how to infer physical properties of the visual world from observed 2D movement, relate the slight wiggling motion due to refraction to the movement of hot air, and infer the location and velocity of the airflow.
Semi-Supervised Detection of Extreme Weather Events in Large Climate Datasets
The approach is able to leverage temporal information and unlabelled data to improve the localization of extreme weather events; the representations learned by the model are explored in order to better understand this important data and to facilitate further work in understanding and mitigating the effects of climate change.
ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events
This work presents a multichannel spatiotemporal CNN architecture for semi-supervised bounding box prediction and exploratory data analysis, and demonstrates that this approach is able to leverage temporal information and unlabeled data to improve the localization of extreme weather events.


Dynamic Textures
A characterization of dynamic textures that poses the problems of modeling, learning, recognizing and synthesizing dynamic textures on a firm analytical footing, and experimental evidence that, within the framework, even low-dimensional models can capture very complex visual phenomena is presented.
Generating Videos with Scene Dynamics
A generative adversarial network for video with a spatio-temporal convolutional architecture that untangles the scene's foreground from the background is proposed; it can generate tiny videos up to a second long at full frame rate better than simple baselines.
A Generative Method for Textured Motion: Analysis and Synthesis
A generative method that combines statistical models and algorithms from both texture and motion analysis is presented, together with an EM-like stochastic gradient algorithm for inference of the hidden variables: bases, movetons, birth/death maps, and parameters of the dynamics.
Kernel Learning for Dynamic Texture Synthesis
To capture the nonlinearity of training frames, traditional PCR is extended to its kernelized version and kernel principal component regression (KPCR) is introduced to model and synthesize DTs, which makes KADTS ideally suited for real-world applications.
Analysis and synthesis of textured motion: particles and waves
A generative model for representing these motion patterns and a Markov chain Monte Carlo algorithm for inferring the generative representation from observed video sequences are presented, and the controllability of the model is demonstrated.
Learning FRAME Models Using CNN Filters
This conceptual paper proposes to learn the generative FRAME model using the highly expressive filters pre-learned by the CNN at the convolutional layers, and shows that the learning algorithm can generate realistic and rich object and texture patterns in natural scenes.
Deep Convolutional Inverse Graphics Network
This paper presents the Deep Convolution Inverse Graphics Network (DC-IGN), a model that aims to learn an interpretable representation of images, disentangled with respect to three-dimensional scene…
DRAW: A Recurrent Neural Network For Image Generation
The Deep Recurrent Attentive Writer neural network architecture for image generation substantially improves on the state of the art for generative models on MNIST, and, when trained on the Street View House Numbers dataset, it generates images that cannot be distinguished from real data with the naked eye.
A Theory of Generative ConvNet
It is shown that a generative random field model can be derived from the commonly used discriminative ConvNet, by assuming a ConvNet for multicategory classification and assuming one of the categories is a base category generated by a reference distribution.
A Neural Algorithm of Artistic Style
This work introduces an artificial system based on a Deep Neural Network that creates artistic images of high perceptual quality and offers a path forward to an algorithmic understanding of how humans create and perceive artistic imagery.