Corpus ID: 236428590

Self-Conditioned Probabilistic Learning of Video Rescaling

  • Yuan Tian, Guo Lu, Xiongkuo Min, Zhaohui Che, Guangtao Zhai, Guodong Guo, Zhiyong Gao
  • Published 24 July 2021
  • Computer Science
  • ArXiv
Bicubic downscaling is a prevalent technique used to reduce the video storage burden or to accelerate downstream processing. However, the inverse upscaling step is non-trivial, and the downscaled video may also deteriorate the performance of downstream tasks. In this paper, we propose a self-conditioned probabilistic framework for video rescaling that learns the paired downscaling and upscaling procedures simultaneously. During training, we decrease the entropy of the information…
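The bicubic downscale-then-upscale baseline the abstract refers to can be sketched with Pillow's resampling API; this is a minimal illustration of the conventional pipeline, not the paper's learned method, and the function names here are illustrative.

```python
import numpy as np
from PIL import Image

def rescale_roundtrip(frame: np.ndarray, factor: int = 4) -> np.ndarray:
    """Bicubic downscale followed by naive bicubic upscale (the baseline)."""
    h, w = frame.shape[:2]
    img = Image.fromarray(frame)
    lr = img.resize((w // factor, h // factor), Image.BICUBIC)  # downscale
    sr = lr.resize((w, h), Image.BICUBIC)                       # upscale back
    return np.asarray(sr)

def psnr(a: np.ndarray, b: np.ndarray) -> float:
    """Peak signal-to-noise ratio for 8-bit images."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

# A smooth synthetic frame (horizontal gradient) survives the roundtrip well;
# real video frames with fine texture lose far more detail, which is the
# information loss a learned rescaling framework tries to control.
frame = np.tile(np.linspace(0, 255, 128, dtype=np.uint8), (128, 1))
rec = rescale_roundtrip(frame, factor=4)
print(frame.shape == rec.shape, psnr(frame, rec) > 30.0)
```

The roundtrip preserves frame dimensions but not content; quantifying that loss (e.g. via PSNR) is the standard way the rescaling literature compares upscaling strategies.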
1 Citation
Deep Learning for Visual Data Compression
  • Guo Lu, Ren Yang, Shenlong Wang, Shan Liu, R. Timofte
  • Computer Science
  • ACM Multimedia
  • 2021
This survey introduces recent progress in deep learning based visual data compression, including image compression, video compression, and point cloud compression.


References

Task-Aware Image Downscaling
This paper proposes an auto-encoder-based framework that enables joint learning of the downscaling network and the upscaling network to maximize restoration performance, and validates the model's generalization capability by applying it to the task of image colorization.
Scale-Space Flow for End-to-End Optimized Video Compression
This paper proposes scale-space flow, an intuitive generalization of optical flow that adds a scale parameter, allowing the network to better model uncertainty. The approach outperforms analogous state-of-the-art learned video compression models while being trained with a much simpler procedure and without any pre-trained optical flow networks.
An End-to-End Learning Framework for Video Compression
This paper proposes the first end-to-end deep video compression framework that can outperform the widely used video coding standard H.264 and be on par with the latest standard H.265.
Neural Inter-Frame Compression for Video Coding
This work presents an inter-frame compression approach for neural video coding that can seamlessly build on different existing neural image codecs, and proposes to compute residuals directly in latent space instead of pixel space so that the same image compression network can be reused for both key frames and intermediate frames.
DVC: An End-To-End Deep Video Compression Framework
This paper proposes the first end-to-end deep video compression model that jointly optimizes all the components for video compression, and shows that the proposed approach can outperform the widely used video coding standard H.264 in terms of PSNR and be on par with the latest standard H.265 in terms of MS-SSIM.
Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation
A novel end-to-end deep neural network that generates dynamic upsampling filters and a residual image, which are computed depending on the local spatio-temporal neighborhood of each pixel to avoid explicit motion compensation is proposed.
FVC: A New Framework towards Deep Video Compression in Feature Space
  • Zhihao Hu, Guo Lu, Dong Xu
  • Engineering, Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2021
This work proposes a feature-space video coding network (FVC) that performs all major operations (i.e., motion estimation, motion compression, motion compensation, and residual compression) in the feature space, using an auto-encoder-style network.
Deep Non-Local Kalman Network for Video Compression Artifact Reduction
This work proposes a deep non-local Kalman network for compression artifact reduction; video restoration is modeled as a Kalman filtering procedure, and the decoded frames can be restored from the proposed deep Kalman model.
Progressive Fusion Video Super-Resolution Network via Exploiting Non-Local Spatio-Temporal Correlations
This study proposes a novel progressive fusion network for video SR, designed to make better use of spatio-temporal information, and shows it to be more efficient and effective than the existing direct fusion, slow fusion, or 3D convolution strategies.
M-LVC: Multiple Frames Prediction for Learned Video Compression
An end-to-end learned video compression scheme for low-latency scenarios that introduces the usage of multiple previous frames as references and designs an MV refinement network and a residual refinement network, both making use of the multiple reference frames.