The U-Net based GLOW for Optical-Flow-Free Video Interframe Generation

@inproceedings{Park2022TheUB,
  title={The U-Net based GLOW for Optical-Flow-Free Video Interframe Generation},
  author={Saem Mul Park and Dong Yan Han and Nojun Kwak},
  booktitle={ICPRAM},
  year={2022}
}
Video frame interpolation is the task of creating an interframe between two adjacent frames along the time axis. Rather than simply averaging the two adjacent frames into an intermediate image, the operation should maintain semantic continuity with them. Most conventional methods use optical flow, for which various tools such as occlusion handling and object smoothing are indispensable. Since the use of these various tools leads to complex problems, we tried to tackle the video…
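To make the point about simple averaging concrete, the following is a minimal sketch that is not taken from the paper. The function name `average_frames` and the one-row toy frames are hypothetical, chosen only to show that a per-pixel mean blends the two inputs instead of placing a moving object at an intermediate position.

```python
def average_frames(frame_a, frame_b):
    """Naive interframe: per-pixel mean of two equally sized grayscale frames.

    Frames are lists of rows, each row a list of intensities (hypothetical layout).
    """
    same_rows = len(frame_a) == len(frame_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(frame_a, frame_b)):
        raise ValueError("frames must have the same shape")
    return [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(frame_a, frame_b)]


# Toy example: a bright pixel moves from column 0 to column 2 between the frames.
frame_t0 = [[255, 0, 0]]
frame_t1 = [[0, 0, 255]]

blended = average_frames(frame_t0, frame_t1)

# The mean leaves half-intensity copies at both endpoints and nothing at column 1,
# where a semantically continuous interframe would place the pixel.
print(blended)
```

In this toy case the averaged frame contains two faint ghosts rather than one object midway along its path, which is the kind of discontinuity a learned interpolation method is meant to avoid.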

References

Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
TLDR
This work proposes an end-to-end convolutional neural network for variable-length multi-frame video interpolation, where the motion interpretation and occlusion reasoning are jointly modeled.
PhaseNet for Video Frame Interpolation
TLDR
This work proposes a new approach, PhaseNet, that is designed to robustly handle challenging scenarios while also coping with larger motion, and shows that this is superior to the hand-crafted heuristics previously used in phase-based methods and compares favorably to recent deep learning based approaches for video frame interpolation on challenging datasets.
FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
TLDR
The concept of end-to-end learning of optical flow is advanced and made to work really well, and faster variants that allow optical flow computation at up to 140fps with accuracy matching the original FlowNet are presented.
IM-Net for High Resolution Video Frame Interpolation
TLDR
IM-Net is proposed: an interpolated motion neural network based on an economic structured architecture and end-to-end training with multi-scale tailored losses that outperforms previous methods on a high resolution version of the recently introduced Vimeo triplet dataset.
Video Frame Interpolation via Adaptive Convolution
TLDR
This paper presents a robust video frame interpolation method that considers pixel synthesis for the interpolated frame as local convolution over two input frames and employs a deep fully convolutional neural network to estimate a spatially-adaptive convolution kernel for each pixel.
MaskFlownet: Asymmetric Feature Matching With Learnable Occlusion Mask
TLDR
This work proposes an asymmetric occlusion-aware feature matching module that can learn a rough occlusion mask, filtering useless areas immediately after feature warping without any explicit supervision; the resulting network, MaskFlownet, surpasses all published optical flow methods on the MPI Sintel, KITTI 2012 and 2015 benchmarks.
A Database and Evaluation Methodology for Optical Flow
TLDR
This paper proposes a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms and analyzes the results obtained to date to draw a large number of conclusions.
PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
TLDR
PWC-Net has been designed according to simple and well-established principles: pyramidal processing, warping, and the use of a cost volume, and outperforms all published optical flow methods on the MPI Sintel final pass and KITTI 2015 benchmarks.
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation
N. Mayer, Eddy Ilg, T. Brox. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
TLDR
This paper proposes three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks and presents a convolutional network for real-time disparity estimation that provides state-of-the-art results.
i-RevNet: Deep Invertible Networks
TLDR
This work builds i-RevNet, a network that can be fully inverted up to the final projection onto the classes (i.e., no information is discarded), and reconstructs linear interpolations between natural image representations.