Learning Binary Residual Representations for Domain-specific Video Streaming

Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
AAAI Conference on Artificial Intelligence
We study domain-specific video streaming. Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission. Several popular video streaming services, such as the video game streaming services of GeForce Now and Twitch, fall in this category. While conventional video compression standards such as H.264 are commonly used for this task, we hypothesize that… 

Boosting Image and Video Compression via Learning Latent Residual Patterns

This paper focuses on utilizing the residual information, which is the difference between a compressed video and its corresponding original/uncompressed one, and proposes a fairly efficient way to transmit the residual with the compressed video in order to boost the quality of video compression.
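As a minimal sketch of the residual idea described above, the snippet below computes the per-pixel difference between an original frame and its decoded, compressed version, then adds it back on the receiving side. The frame values and the function names `residual` and `add_residual` are hypothetical and chosen only for illustration; the paper's actual method learns a compact representation of this residual rather than sending it raw.

```python
def residual(original, compressed):
    """Per-pixel difference: original frame minus its decoded, compressed version."""
    return [[o - c for o, c in zip(orow, crow)]
            for orow, crow in zip(original, compressed)]


def add_residual(compressed, res):
    """Receiver side: add the residual back onto the decoded frame."""
    return [[c + r for c, r in zip(crow, rrow)]
            for crow, rrow in zip(compressed, res)]


# Toy 2x3 grayscale frames (made-up values, for illustration only).
original = [[52, 55, 61], [59, 79, 61]]
compressed = [[50, 56, 60], [60, 76, 64]]

res = residual(original, compressed)
restored = add_residual(compressed, res)
```

With a lossless residual the restored frame equals the original exactly; any practical scheme trades some of that accuracy for a smaller residual bitstream.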

DVC: An End-To-End Deep Video Compression Framework

This paper proposes the first end-to-end deep video compression model that jointly optimizes all the components for video compression, and shows that the proposed approach can outperform the widely used video coding standard H.264 in terms of PSNR and be on par with the latest standard, H.265, in terms of MS-SSIM.

Learning Patterns of Latent Residual for Improving Video Compression

This work focuses on transmitting the residual from the original video, i.e. difference between a compressed video and its corresponding original/uncompressed one, together with the compressed video during video transmission.

Improving Deep Video Compression by Resolution-adaptive Flow Coding

A new framework called Resolution-adaptive Flow Coding (RaFC) is proposed to effectively compress the flow maps globally and locally, in which multi-resolution representations are used for both the input flow maps and the output motion features of the MV encoder.

Deep Learning-Based Video Coding

To advocate research on deep learning-based video coding, this work presents a case study of a prototype video codec, Deep Learning Video Coding (DLVC), which features two deep tools that are both based on convolutional neural networks: a CNN-based in-loop filter and CNN-based block adaptive resolution coding.

Spatial and Temporal Uncertainty based Context Aware Adaptive Video Compression Using Deep learning

This work proposes an adaptive video compression technique that uses deep learning to reduce data size by exploiting processing power rather than increasing storage and transmission capacity.

Video Compression through Image Interpolation

This paper presents an alternative in an end-to-end deep learning codec that outperforms today's prevailing codecs, such as H.261, MPEG-4 Part 2, and performs on par with H.264.

A Switchable Deep Learning Approach for In-Loop Filtering in Video Coding

Experimental results show that the proposed scheme outperforms state-of-the-art work in coding efficiency, while the computational complexity is acceptable after GPU acceleration.

Standard vs. Learning-based Codecs for Real Time Endoscopic Video Transmission

We compare traditional encoding/decoding methods for real-time video streaming, such as H.264/AVC and H.265/HEVC, with deep learning-based methods that are expected to deliver higher video quality at…

CAESR: Conditional Autoencoder and Super-Resolution for Learned Spatial Scalability

CAESR is a hybrid learning-based coding approach for spatial scalability based on the versatile video coding (VVC) standard. It relies on conditional coding that learns the optimal mixture of the source and the upscaled BL image, enabling better performance than residual coding.



Building Dual-Domain Representations for Compression Artifacts Reduction

This work proposes a highly accurate approach to removing artifacts from JPEG-compressed images that jointly learns a very deep convolutional network in both the DCT and pixel domains, and shows large improvements over the state of the art.

Learning to compress images and videos

This work presents an intuitive scheme for lossy color-image compression, using the color information from a few representative pixels to learn a model which predicts color on the rest of the pixels, and a simple active learning strategy to choose the representative pixels.

Context Encoders: Feature Learning by Inpainting

It is found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures, and can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.

Variable Rate Image Compression with Recurrent Neural Networks

A general framework for variable-rate image compression and a novel architecture based on convolutional and deconvolutional LSTM recurrent networks are proposed, which provide better visual quality than (headerless) JPEG, JPEG2000 and WebP, with a storage size reduced by 10% or more.

D3: Deep Dual-Domain Based Fast Restoration of JPEG-Compressed Images

In this paper, we design a Deep Dual-Domain (D3) based fast restoration model to remove artifacts of JPEG-compressed images. It leverages the large learning capacity of deep networks, as well as the…

Neural Adaptive Video Streaming with Pensieve

Pensieve, a system that generates ABR algorithms using reinforcement learning (RL), is proposed; it outperforms the best state-of-the-art scheme, with improvements in average QoE of 12% to 25%.

Real-Time Adaptive Image Compression

This work presents a machine learning-based approach to lossy image compression that outperforms all existing codecs while running in real time, and supplements the approach with adversarial training specialized for use in a compression setting.

Full Resolution Image Compression with Recurrent Neural Networks

This is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.

Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network

This paper presents the first convolutional neural network capable of real-time SR of 1080p videos on a single K2 GPU and introduces an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output.
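The rearrangement step usually associated with sub-pixel convolution can be sketched as follows: r*r low-resolution feature maps are interleaved into one map that is r times larger in each dimension. This is only an illustration under stated assumptions: the function name `pixel_shuffle` and the channel ordering (channel index `c*r*r + i*r + j` feeds output offset `(i, j)`) follow one common convention and are not taken from the summary above, and the learned convolution filters that produce the low-resolution maps are omitted.

```python
def pixel_shuffle(feature_maps, r):
    """Rearrange C*r*r maps of size H x W into C maps of size (H*r) x (W*r)."""
    channels = len(feature_maps) // (r * r)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels)]
    for c in range(channels):
        for i in range(r):
            for j in range(r):
                # Assumed ordering: this map supplies sub-pixel offset (i, j).
                src = feature_maps[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x1 low-resolution maps become one 2x2 high-resolution map.
lr_maps = [[[1]], [[2]], [[3]], [[4]]]
hr = pixel_shuffle(lr_maps, 2)
```

Because the upscaling happens only in this final rearrangement, all preceding convolutions run at low resolution, which is the source of the efficiency the summary mentions.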

Overview of fine granularity scalability in MPEG-4 video standard

Weiping Li, IEEE Trans. Circuits Syst. Video Technol., 2001
This paper provides an overview of the Fine Granularity Scalability (FGS) video coding technique, which was developed in an amendment of MPEG-4 to address a variety of challenging problems in delivering video over the Internet.