Corpus ID: 236428203

Towards Generative Video Compression

@article{Mentzer2021TowardsGV,
  title={Towards Generative Video Compression},
  author={Fabian Mentzer and E. Agustsson and Johannes Ball{\'e} and David C. Minnen and Nick Johnston and G. Toderici},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.12038}
}
We present a neural video compression method based on generative adversarial networks (GANs) that outperforms previous neural video compression methods and is comparable to HEVC in a user study. We propose a technique to mitigate temporal error accumulation caused by recursive frame compression that uses randomized shifting and un-shifting, motivated by a spectral analysis. We present in detail the network design choices, their relative importance, and elaborate on the challenges of evaluating…

References

Generative Adversarial Networks for Extreme Learned Image Compression
TLDR
If a semantic label map of the original image is available, the learned image compression system can fully synthesize unimportant regions in the decoded image such as streets and trees from the label map, proportionally reducing the storage cost.
Feedback Recurrent Autoencoder for Video Compression
TLDR
This work proposes a new network architecture, based on common and well-studied components, for learned video compression operating in low-latency mode, and yields state-of-the-art MS-SSIM/rate performance on the high-resolution UVG dataset.
Video Compression through Image Interpolation
TLDR
This paper presents an end-to-end deep learning codec that outperforms today's prevailing codecs, such as H.261 and MPEG-4 Part 2, and performs on par with H.264.
Video Compression With Rate-Distortion Autoencoders
TLDR
A deep generative model for lossy video compression is presented that outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation and opens up novel video compression applications, which have not been feasible with classical codecs.
Neural Inter-Frame Compression for Video Coding
TLDR
This work presents an inter-frame compression approach for neural video coding that can seamlessly build on different existing neural image codecs, and proposes to compute residuals directly in latent space instead of in pixel space so that the same image compression network can be reused for both key frames and intermediate frames.
DVC: An End-To-End Deep Video Compression Framework
TLDR
This paper proposes the first end-to-end video compression deep model that jointly optimizes all the components for video compression, and shows that the proposed approach can outperform the widely used video coding standard H.264 in terms of PSNR and be even on par with the latest standard, H.265, in terms of MS-SSIM.
Video-to-Video Synthesis
TLDR
This paper proposes a novel video-to-video synthesis approach under the generative adversarial learning framework, capable of synthesizing 2K resolution videos of street scenes up to 30 seconds long, which significantly advances the state-of-the-art of video synthesis.
Scale-Space Flow for End-to-End Optimized Video Compression
TLDR
This paper proposes scale-space flow, an intuitive generalization of optical flow that adds a scale parameter to allow the network to better model uncertainty and outperform analogous state-of-the-art learned video compression models while being trained using a much simpler procedure and without any pre-trained optical flow networks.
Deep Generative Models for Distribution-Preserving Lossy Compression
TLDR
This work proposes and studies the problem of distribution-preserving lossy compression to optimize the rate-distortion tradeoff under the constraint that the reconstructed samples follow the distribution of the training data, and recovers both ends of the spectrum.
Conditional Entropy Coding for Efficient Video Compression
TLDR
A very simple and efficient video compression framework that focuses only on modeling the conditional entropy between frames; it outperforms H.265 and other deep learning baselines in MS-SSIM on higher-bitrate UVG video, and outperforms all video codecs on lower framerates.