Progressive Training of A Two-Stage Framework for Video Restoration

  • Mei Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen
  • Published 21 April 2022
  • Computer Science
  • 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
As a widely studied task, video restoration aims to enhance the quality of videos affected by multiple potential degradations, such as noise, blur and compression artifacts. Among video restoration tasks, compressed video quality enhancement and video super-resolution are two of the main tasks with significant value in practical scenarios. Recently, recurrent neural networks and transformers have attracted increasing research interest in this field, due to their impressive capability in sequence-to…


HST: Hierarchical Swin Transformer for Compressed Image Super-resolution

The Hierarchical Swin Transformer (HST) network is proposed to restore low-resolution compressed images; it jointly captures hierarchical feature representations and enhances the representation at each scale with a Swin Transformer.

AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

The AdamW optimizer is used to train the model for 1,000,000 iterations with an initial learning rate of 2 × 10⁻⁴, decayed with a cosine strategy; the weight decay is 10⁻⁴ for the whole training period.
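The cosine decay mentioned above can be sketched as a simple schedule function (a minimal sketch: the base learning rate and iteration count come from the summary, while the minimum learning rate and the exact schedule form are assumptions).

```python
import math

def cosine_lr(step, total_steps=1_000_000, base_lr=2e-4, min_lr=0.0):
    """Cosine-annealed learning rate, decaying from base_lr to min_lr.

    Assumed form: lr(t) = min_lr + 0.5 * (base_lr - min_lr) * (1 + cos(pi * t / T)).
    """
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# The rate starts at base_lr and decays smoothly to min_lr by the final iteration.
start_lr = cosine_lr(0)            # == 2e-4
final_lr = cosine_lr(1_000_000)    # == 0.0
```

In a real training loop this value would be assigned to the optimizer's learning rate each iteration (e.g. via a PyTorch scheduler); weight decay stays fixed at 10⁻⁴ throughout.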

VRT: A Video Restoration Transformer

Experimental results on video super-resolution, video deblurring, video denoising, video frame interpolation and space-time video super-resolution demonstrate that VRT outperforms the state-of-the-art methods by large margins.

MFQE 2.0: A New Approach for Multi-Frame Quality Enhancement on Compressed Video

In this paper, a Bidirectional Long Short-Term Memory based detector is developed and a novel Multi-Frame Convolutional Neural Network is designed to enhance the quality of compressed video, taking a non-PQF and its two nearest peak-quality frames (PQFs) as input.
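The multi-frame input described above — a non-PQF together with its nearest PQF on each side — can be sketched as a frame-selection step (a toy sketch: the BiLSTM detector is replaced here by a given list of detected PQF indices, and the function name is illustrative).

```python
def nearest_pqfs(frame_idx, pqf_indices):
    """Return the nearest peak-quality frame (PQF) index before and after frame_idx.

    pqf_indices: sorted frame indices detected as PQFs.
    Falls back to the closest available PQF when frame_idx lies before the
    first or after the last detected PQF.
    """
    before = [i for i in pqf_indices if i < frame_idx]
    after = [i for i in pqf_indices if i > frame_idx]
    prev_pqf = before[-1] if before else after[0]
    next_pqf = after[0] if after else before[-1]
    return prev_pqf, next_pqf

# Frame 5 lies between detected PQFs 3 and 8, so frames 3, 5 and 8 would form
# the three-frame input to the enhancement network.
pair = nearest_pqfs(5, [0, 3, 8, 12])  # (3, 8)
```

The enhancement CNN then fuses these three frames, letting the high-quality neighbors compensate for detail lost in the low-quality frame.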

Multi-frame Quality Enhancement for Compressed Video

This paper observes that heavy quality fluctuations exist across compressed video frames, so low-quality frames can be enhanced using neighboring high-quality frames, an approach termed Multi-Frame Quality Enhancement (MFQE).

EDVR: Video Restoration With Enhanced Deformable Convolutional Networks

This work proposes a novel Video Restoration framework with Enhanced Deformable convolutions, termed EDVR, and proposes a Temporal and Spatial Attention (TSA) fusion module, in which attention is applied both temporally and spatially, so as to emphasize important features for subsequent restoration.

Video Super-Resolution Transformer

This paper presents a spatial-temporal convolutional self-attention layer, with a theoretical understanding, to exploit locality information, and designs a bidirectional optical flow-based feed-forward layer to discover correlations across different video frames and also align features.
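The locality idea behind such layers — each position attending only to a small neighborhood rather than to every position — can be illustrated with a one-dimensional toy version (a sketch under simplifying assumptions: 1-D sequences, a plain windowed scaled dot-product attention, not the paper's actual convolutional layer).

```python
import math

def local_attention(queries, keys, values, window=1):
    """Scaled dot-product attention restricted to a local window (1-D toy).

    queries/keys/values: equal-length lists of feature vectors (lists of floats).
    Position i attends only to positions within `window` of i.
    """
    dim = len(queries[0])
    out = []
    for i, q in enumerate(queries):
        lo, hi = max(0, i - window), min(len(keys), i + window + 1)
        # Dot-product scores against keys inside the window, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
                  for j in range(lo, hi)]
        # Numerically stable softmax over the window.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the windowed values.
        out.append([sum(w * values[j][d] for w, j in zip(weights, range(lo, hi)))
                    for d in range(dim)])
    return out

# With identical keys, the softmax weights are uniform over each window, so the
# output is simply a local average of the values.
q = [[1.0, 0.0]] * 3
v = [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]]
averaged = local_attention(q, q, v, window=1)
```

Restricting attention to a window keeps the cost linear in sequence length, which is the practical payoff of exploiting locality in video transformers.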

Restormer: Efficient Transformer for High-Resolution Image Restoration

This work proposes an efficient Transformer model by making several key designs in the building blocks (multi-head attention and feed-forward network) such that it can capture long-range pixel interactions, while still remaining applicable to large images.

ViViT: A Video Vision Transformer

This work shows how to effectively regularise the model during training and leverage pretrained image models to be able to train on comparatively small datasets, and achieves state-of-the-art results on multiple video classification benchmarks.

Fast Online Video Super-Resolution with Deformable Attention Pyramid

This work proposes a recurrent VSR architecture based on a deformable attention pyramid (DAP), which aligns and integrates information from the recurrent state into the current frame prediction; it significantly reduces processing time in comparison to state-of-the-art methods while maintaining high performance.

Spatio-Temporal Deformable Convolution for Compressed Video Quality Enhancement

This paper proposes a fast yet effective method for compressed video quality enhancement, incorporating a novel Spatio-Temporal Deformable Fusion (STDF) scheme to aggregate temporal information, and achieves state-of-the-art performance on compressed video quality enhancement in terms of both accuracy and efficiency.

Decoder-side HEVC quality enhancement with scalable convolutional neural network

A Decoder-side Scalable Convolutional Neural Network (DS-CNN) approach is proposed to achieve quality enhancement for HEVC; unlike existing CNN-based quality enhancement approaches, it requires no modification of the encoder.