Video Frame Synthesis Using Deep Voxel Flow

@article{Liu2017VideoFS,
  title={Video Frame Synthesis Using Deep Voxel Flow},
  author={Ziwei Liu and Raymond A. Yeh and Xiaoou Tang and Yiming Liu and Aseem Agarwala},
  journal={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={4473-4481}
}
We address the problem of synthesizing new video frames in an existing video, either in-between existing frames (interpolation) or subsequent to them (extrapolation). [...] Key Method: We combine the advantages of these two methods by training a deep network that learns to synthesize video frames by flowing pixel values from existing ones, which we call deep voxel flow. Our method requires no human supervision, and any video can be used as training data by dropping, and then learning to predict, existing frames.
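To make the mechanism concrete, below is a minimal NumPy sketch of the voxel-flow sampling step: a network (omitted here) predicts a per-pixel spatial offset and a temporal blend weight, and the output frame is obtained by trilinear sampling of the two input frames. Function and variable names are illustrative, not taken from the authors' code.

# A minimal sketch of voxel-flow sampling, assuming two input frames and a
# predicted per-pixel voxel flow (dx, dy, dt); the predicting network is
# omitted. Names are illustrative, not from the authors' implementation.
import numpy as np

def voxel_flow_sample(frame0, frame1, flow):
    """Synthesize the in-between frame by trilinear sampling.

    frame0, frame1: (H, W) grayscale frames (the video "volume" in time).
    flow: (H, W, 3); flow[..., :2] is the spatial offset (dx, dy) and
          flow[..., 2] in [0, 1] blends the two frames temporally.
    """
    H, W = frame0.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # Pixels of the middle frame look backward into frame0 and forward
    # into frame1 along the flow vector.
    x0 = np.clip(xs - flow[..., 0], 0, W - 1)
    y0 = np.clip(ys - flow[..., 1], 0, H - 1)
    x1 = np.clip(xs + flow[..., 0], 0, W - 1)
    y1 = np.clip(ys + flow[..., 1], 0, H - 1)

    def bilinear(img, x, y):
        x0i, y0i = np.floor(x).astype(int), np.floor(y).astype(int)
        x1i, y1i = np.minimum(x0i + 1, W - 1), np.minimum(y0i + 1, H - 1)
        wx, wy = x - x0i, y - y0i
        return ((1 - wx) * (1 - wy) * img[y0i, x0i] +
                wx * (1 - wy) * img[y0i, x1i] +
                (1 - wx) * wy * img[y1i, x0i] +
                wx * wy * img[y1i, x1i])

    w = flow[..., 2]  # temporal weight: the third axis of the trilinear sample
    return (1 - w) * bilinear(frame0, x0, y0) + w * bilinear(frame1, x1, y1)

Because the sampling is differentiable, the whole pipeline can be trained end-to-end with only a reconstruction loss against the dropped frame, which is what makes the self-supervised training described above possible.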
Video Frame Interpolation Using Recurrent Convolutional Layers
TLDR
This paper proposes a novel frame interpolation method, called DVF-RCL, based on the deep voxel flow video synthesis approach; it greatly improves the performance of the original DVF and produces results that compare favorably to state-of-the-art methods both quantitatively and qualitatively.
An Analytical Study of CNN-based Video Frame Interpolation Techniques
TLDR
This paper discusses how deep convolutional network based methods have evolved over the years to improve the quality of synthesized frames, both qualitatively and quantitatively, for the video frame interpolation task.
Video Frame Synthesis via Plug-and-Play Deep Locally Temporal Embedding
TLDR
It is demonstrated that the proposed generative framework, which is powered by a deep CNN and can be used instantly like conventional models, outperforms existing state-of-the-art models in terms of perceptual quality.
High-quality Frame Interpolation via Tridirectional Inference
TLDR
This work proposes a frame interpolation method that utilizes tridirectional information obtained from three input frames to learn rich and reliable inter-frame motion representations, including subtle nonlinear movement, and can be trained easily on any video frames in a self-supervised manner.
Depth-Aware Video Frame Interpolation
TLDR
A video frame interpolation method that explicitly detects occlusion by exploiting depth information; it develops a depth-aware flow projection layer to synthesize intermediate flows that preferentially sample closer objects over farther ones.
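As a rough illustration of what such a depth-aware flow projection layer might compute, the sketch below projects a forward flow to an intermediate time step and resolves collisions by inverse-depth weighting, so closer objects dominate. This is a simplified scalar-weighted version, not the authors' implementation; all names are illustrative.

# Hedged sketch: when several source pixels project onto the same target
# location at time t, weight their flow contributions by inverse depth.
import numpy as np

def project_flow_depth_aware(flow01, depth0, t=0.5):
    """flow01: (H, W, 2) optical flow from frame 0 to frame 1.
    depth0: (H, W) depth of frame 0 (smaller = closer).
    Returns an (H, W, 2) approximate flow from time t back to frame 0."""
    H, W, _ = flow01.shape
    acc = np.zeros((H, W, 2))
    weight = np.zeros((H, W))
    ys, xs = np.mgrid[0:H, 0:W]
    # Each source pixel lands at x + t*flow in the intermediate frame.
    tx = np.clip(np.round(xs + t * flow01[..., 0]).astype(int), 0, W - 1)
    ty = np.clip(np.round(ys + t * flow01[..., 1]).astype(int), 0, H - 1)
    w = 1.0 / (depth0 + 1e-6)  # inverse depth: closer -> larger weight
    np.add.at(weight, (ty, tx), w)
    np.add.at(acc, (ty, tx), (-t * flow01) * w[..., None])
    mask = weight > 0
    acc[mask] /= weight[mask][..., None]
    return acc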
Video Frame Interpolation Via Residue Refinement
  • Haopeng Li, Yuan Yuan, Qi Wang
  • Computer Science
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
TLDR
A novel network structure is proposed that leverages residue refinement and an adaptive weight map to synthesize in-between frames; the adaptive weight map combines the forward- and backward-warped frames to reduce artifacts.
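A minimal sketch of the blending step this summary describes, assuming the warped frames, weight map, and residue have already been produced by the network; names and shapes are illustrative.

import numpy as np

def blend_with_adaptive_weight(warped_fwd, warped_bwd, weight, residue):
    """warped_fwd, warped_bwd: (H, W, 3) frames warped toward time t.
    weight: (H, W, 1) per-pixel map in [0, 1] predicted by the network.
    residue: (H, W, 3) learned refinement correcting warping artifacts."""
    blended = weight * warped_fwd + (1.0 - weight) * warped_bwd
    return np.clip(blended + residue, 0.0, 1.0)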
Video Frame Interpolation via Cyclic Fine-Tuning and Asymmetric Reverse Flow
TLDR
This work uses a convolutional neural network that takes two frames as input and predicts two optical flows with pixelwise weights; the method outperforms the publicly available state-of-the-art methods on multiple datasets.
PhaseNet for Video Frame Interpolation
TLDR
This work proposes a new approach, PhaseNet, that is designed to robustly handle challenging scenarios while also coping with larger motion; it is shown to be superior to the hand-crafted heuristics previously used in phase-based methods and compares favorably to recent deep learning based approaches for video frame interpolation on challenging datasets.
Learning Spatial Transform for Video Frame Interpolation
TLDR
This work redefines the task as finding the spatial transform between adjacent frames and proposes a new neural network architecture that combines the two abovementioned approaches, namely adaptive convolution and deformable convolution.
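As a hedged sketch of what an adaptive deformable convolution can look like in this setting: each output pixel gathers K input samples at learned offsets and mixes them with learned per-pixel weights. Nearest-neighbor sampling is used here for brevity (real operators use bilinear sampling); all names and shapes are illustrative, not taken from the paper.

import numpy as np

def adaptive_deformable_sample(frame, offsets, weights):
    """frame: (H, W); offsets: (H, W, K, 2) learned (dy, dx) per sample;
    weights: (H, W, K) per-pixel mixing weights summing to 1 over K."""
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    out = np.zeros((H, W))
    K = offsets.shape[2]
    for k in range(K):
        # Round to nearest pixel for simplicity; bilinear in practice.
        sy = np.clip(np.round(ys + offsets[..., k, 0]).astype(int), 0, H - 1)
        sx = np.clip(np.round(xs + offsets[..., k, 1]).astype(int), 0, W - 1)
        out += weights[..., k] * frame[sy, sx]
    return out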
Self-Reproducing Video Frame Interpolation
TLDR
This paper introduces a novel self-reproducing mechanism, whereby the real (given) frames can in turn be interpolated from the interpolated ones, to substantially improve the consistency and performance of video frame interpolation.
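The self-reproducing idea can be expressed as a cycle-consistency training signal, sketched below under the assumption of a generic interpolation network interp; the loss form is illustrative, not the paper's exact objective.

import numpy as np

def cycle_consistency_loss(interp, f0, f1, f2):
    """f0, f1, f2: three consecutive real frames as float arrays;
    interp(a, b) returns the midpoint frame between a and b."""
    mid_01 = interp(f0, f1)          # synthetic frame at t = 0.5
    mid_12 = interp(f1, f2)          # synthetic frame at t = 1.5
    f1_hat = interp(mid_01, mid_12)  # re-interpolate the real frame f1
    return np.mean(np.abs(f1_hat - f1))  # L1 reconstruction of f1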

References

SHOWING 1-10 OF 43 REFERENCES
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
TLDR
A novel approach that models future frames in a probabilistic manner is proposed, namely a Cross Convolutional Network to aid in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively.
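A simplified sketch of the cross-convolution operation the summary refers to: image content is encoded as feature maps, sampled motion as per-channel kernels, and synthesis convolves one with the other. The actual network operates at multiple scales; this single-scale version is only illustrative.

import numpy as np
from scipy.signal import convolve2d

def cross_convolve(feature_maps, kernels):
    """feature_maps: (C, H, W) image encoding; kernels: (C, k, k) motion
    kernels decoded from a sampled latent. Convolve channelwise."""
    return np.stack([convolve2d(f, k, mode='same')
                     for f, k in zip(feature_maps, kernels)])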
View Synthesis by Appearance Flow
TLDR
This work addresses the problem of novel view synthesis: given an input image, synthesizing new images of the same object or scene observed from arbitrary viewpoints and shows that for both objects and scenes, this approach is able to synthesize novel views of higher perceptual quality than previous CNN-based techniques.
Deep multi-scale video prediction beyond mean square error
TLDR
This work trains a convolutional network to generate future frames given an input sequence and proposes three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function.
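The image gradient difference loss mentioned here penalizes mismatches between the spatial gradients of the prediction and the ground truth, which sharpens otherwise blurry predictions. A minimal NumPy version (with alpha = 1 by default) follows; names are illustrative.

import numpy as np

def gradient_difference_loss(pred, target, alpha=1.0):
    """pred, target: (H, W) or (H, W, C) float arrays."""
    # Absolute spatial gradients along each image axis.
    gy_p, gy_t = np.abs(np.diff(pred, axis=0)), np.abs(np.diff(target, axis=0))
    gx_p, gx_t = np.abs(np.diff(pred, axis=1)), np.abs(np.diff(target, axis=1))
    # Penalize differences between predicted and true gradient magnitudes.
    return (np.abs(gy_p - gy_t) ** alpha).sum() + (np.abs(gx_p - gx_t) ** alpha).sum()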
Phase-based frame interpolation for video
TLDR
A novel, bounded phase shift correction method that combines phase information across the levels of a multi-scale pyramid is introduced that allows in-between images to be generated by simple per-pixel phase modification, without the need for any form of explicit correspondence estimation.
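As a toy illustration of generating in-between images by phase modification: for a small global shift, linearly interpolating Fourier phases already produces a plausible intermediate frame. The actual method operates on a multi-scale complex steerable pyramid with bounded phase-shift correction; the plain FFT below only conveys the idea.

import numpy as np

def phase_interpolate(frame0, frame1, t=0.5):
    """Blend magnitudes and interpolate wrapped phase differences."""
    F0, F1 = np.fft.fft2(frame0), np.fft.fft2(frame1)
    mag = (1 - t) * np.abs(F0) + t * np.abs(F1)
    # Wrapped phase difference in (-pi, pi]; valid for small shifts.
    dphi = np.angle(F1 * np.conj(F0))
    phase = np.angle(F0) + t * dphi
    return np.real(np.fft.ifft2(mag * np.exp(1j * phase)))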
Learning Image Matching by Simply Watching Video
TLDR
An unsupervised learning based approach to the ubiquitous computer vision problem of image matching that achieves surprisingly good performance, comparable to traditional, empirically designed methods.
Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks
TLDR
This paper proposes to use deep neural networks to automatically convert 2D videos and images to a stereoscopic 3D format and shows that Deep3D outperforms baselines in both quantitative and human subject evaluations.
Generating Videos with Scene Dynamics
TLDR
A generative adversarial network for video with a spatio-temporal convolutional architecture that untangles the scene's foreground from the background is proposed; it can generate tiny videos, up to a second long at full frame rate, better than simple baselines.
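The foreground/background untangling reduces to a soft spatio-temporal composite, m*f + (1 - m)*b, with a static background image; a minimal sketch with illustrative shapes:

import numpy as np

def compose_video(foreground, mask, background):
    """foreground: (T, H, W, 3) moving stream; mask: (T, H, W, 1) in [0, 1];
    background: (H, W, 3) static image broadcast over time."""
    return mask * foreground + (1.0 - mask) * background[None]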
Two-Stream Convolutional Networks for Action Recognition in Videos
TLDR
This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.
Deep Stereo: Learning to Predict New Views from the World's Imagery
TLDR
This work presents a novel deep architecture that performs new view synthesis directly from pixels, trained on a large number of posed image sets, and is the first to apply deep learning to the problem of new view synthesis from sets of real-world, natural imagery.
A Database and Evaluation Methodology for Optical Flow
TLDR
This paper proposes a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms and analyzes the results obtained to date to draw a large number of conclusions.