Deep Animation Video Interpolation in the Wild

  • Siyao Li, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris N. Metaxas, Chen Change Loy, Ziwei Liu
  • Published 6 April 2021
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming. Therefore, it is desirable to develop computational models that can automatically interpolate the in-between animation frames. However, existing video interpolation methods fail to produce satisfying results on animation data. Compared to natural videos, animation videos possess two unique characteristics that make frame interpolation difficult: 1… 

Enhanced Deep Animation Video Interpolation

This work presents AutoFI, a simple and effective method to automatically render training data for deep animation video interpolation, and proposes a plug-and-play sketch-based post-processing module, named SktFI, to help adapt frame interpolation algorithms from natural video to animation video.

Beyond Natural Motion: Exploring Discontinuity for Video Frame Interpolation

A novel data augmentation strategy called figure-text mixing (FTM) is proposed to make the model learn more general scenarios; the resulting model outperforms state-of-the-art methods on natural video datasets containing only continuous motions.

Improving the Perceptual Quality of 2D Animation Interpolation

This work proposes SoftsplatLite (SSL), a forward-warping interpolation architecture with fewer trainable parameters and better perceptual performance, and establishes that the LPIPS perceptual metric and chamfer line distance are more appropriate measures of quality than PSNR and SSIM used in prior art.

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

The proposed architecture makes use of 3D space-time convolutions to enable end to end learning and inference for the task of video frame interpolation and can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.

AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos

The proposed method, AnimeSR, is capable of restoring real-world low-quality animation videos effectively and efficiently, achieving superior performance to previous state-of-the-art methods.

DRVI: Dual Refinement for Video Interpolation

A dual refinement technique for video interpolation (DRVI) with three main steps, namely flow refinement, frame synthesis, and Haar refinement, is proposed; it generates accurate bi-directional flows that are more suitable for frame interpolation tasks.

SketchBetween: Video-to-Video Synthesis for Sprite Animation via Sketches

This work proposes a problem formulation that more closely adheres to the standard workflow of animation, and demonstrates a model, SketchBetween, which learns to map between keyframes and sketched in-betweens to rendered sprite animations.

Many-to-many Splatting for Efficient Video Frame Interpolation

This work proposes a fully differentiable Many-to-Many (M2M) splatting framework to interpolate frames efficiently and finds that it significantly improves the efficiency while maintaining high effectiveness.

Splatting-based Synthesis for Video Frame Interpolation

A deep learning approach that solely relies on splatting to synthesize interpolated frames for video frame interpolation is proposed, which is not only much faster than similar approaches, especially for multi-frame interpolation, but can also yield new state-of-the-art results at high resolutions.

Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation

This work develops a texture consistency loss (TCL) upon the assumption that the interpolated content should maintain similar structures with their counterparts in the given frames, and shows that high-quality interpolated frames are also beneficial to the development of the video super-resolution task.



Quadratic video interpolation

This work proposes a quadratic video interpolation method which exploits the acceleration information in videos, allows prediction with curvilinear trajectory and variable velocity, and generates more accurate interpolation results.

FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation

This work devised a novel structure-to-texture generation framework which splits the video interpolation task into two stages: structure-guided interpolation and texture refinement, and is the first work that attempts to directly generate the intermediate frame through blending deep features.

Video Frame Synthesis Using Deep Voxel Flow

This work addresses the problem of synthesizing new video frames in an existing video, either in-between existing frames (interpolation), or subsequent to them (extrapolation), by training a deep network that learns to synthesize video frames by flowing pixel values from existing ones, which is called deep voxel flow.

Depth-Aware Video Frame Interpolation

A video frame interpolation method which explicitly detects the occlusion by exploring the depth information, and develops a depth-aware flow projection layer to synthesize intermediate flows that preferably sample closer objects than farther ones.

AnimeGAN: A Novel Lightweight GAN for Photo Animation

Experimental results show that the proposed novel lightweight generative adversarial network, called AnimeGAN, can rapidly transform real-world photos into high-quality anime images and outperforms state-of-the-art methods.

Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation

This work proposes an end-to-end convolutional neural network for variable-length multi-frame video interpolation, where the motion interpretation and occlusion reasoning are jointly modeled.

All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling

Results on the Adobe240 dataset show that the proposed method generates visually pleasing, temporally consistent frames, and outperforms the current best off-the-shelf method by 1.57 dB in PSNR with a model that is 8 times smaller and 7.7 times faster.

Phase-based frame interpolation for video

A novel, bounded phase shift correction method is introduced that combines phase information across the levels of a multi-scale pyramid, allowing in-between images to be generated by simple per-pixel phase modification, without the need for any form of explicit correspondence estimation.

Video Frame Interpolation via Adaptive Convolution

This paper presents a robust video frame interpolation method that considers pixel synthesis for the interpolated frame as local convolution over two input frames and employs a deep fully convolutional neural network to estimate a spatially-adaptive convolution kernel for each pixel.

Context-Aware Synthesis for Video Frame Interpolation

  • Simon Niklaus, Feng Liu
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
A context-aware synthesis approach that warps not only the input frames but also their pixel-wise contextual information, uses them to interpolate a high-quality intermediate frame, and outperforms representative state-of-the-art approaches.