SwinIR: Image Restoration Using Swin Transformer

@article{Liang2021SwinIRIR,
  title={SwinIR: Image Restoration Using Swin Transformer},
  author={Jingyun Liang and Jie Cao and Guolei Sun and K. Zhang and Luc Van Gool and Radu Timofte},
  journal={2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
  year={2021},
  pages={1833-1844}
}
  • Jingyun Liang, Jie Cao, Guolei Sun, K. Zhang, Luc Van Gool, Radu Timofte
  • Published 23 August 2021
  • Computer Science
  • 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR… 
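SwinIR's deep feature extraction is built from Swin Transformer layers, i.e., self-attention restricted to local windows of the feature map. Below is a rough, minimal PyTorch sketch of that window-attention idea (not SwinIR's actual code; the block layout, window size of 8, and embedding dimension of 60 are illustrative assumptions):

```python
import torch
import torch.nn as nn

def window_partition(x, ws):
    """Split a (B, H, W, C) feature map into non-overlapping ws x ws windows."""
    B, H, W, C = x.shape
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)   # (num_windows*B, ws*ws, C)

def window_reverse(windows, ws, H, W):
    """Merge windows back into a (B, H, W, C) feature map."""
    B = windows.shape[0] // ((H // ws) * (W // ws))
    x = windows.view(B, H // ws, W // ws, ws, ws, -1)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, -1)

class WindowAttentionBlock(nn.Module):
    """Illustrative block: LayerNorm -> window MHSA -> residual, then MLP -> residual."""
    def __init__(self, dim=60, num_heads=6, ws=8):
        super().__init__()
        self.ws = ws
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 2 * dim), nn.GELU(), nn.Linear(2 * dim, dim))

    def forward(self, x):                 # x: (B, H, W, C), H and W divisible by ws
        B, H, W, C = x.shape
        shortcut = x
        w = window_partition(self.norm1(x), self.ws)   # attention stays local to each window
        w, _ = self.attn(w, w, w)
        x = shortcut + window_reverse(w, self.ws, H, W)
        return x + self.mlp(self.norm2(x))

feat = torch.randn(1, 64, 64, 60)
print(WindowAttentionBlock()(feat).shape)   # torch.Size([1, 64, 64, 60])
```

In the actual Swin design, consecutive blocks alternate between regular and shifted window partitions so that information flows across window boundaries; the sketch above omits that shift.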

Citations

Restormer: Efficient Transformer for High-Resolution Image Restoration

TLDR
This work proposes an efficient Transformer model by introducing several key design changes in the building blocks (multi-head attention and feed-forward network), so that it can capture long-range pixel interactions while remaining applicable to large images.
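One reading of those key designs is that attention is applied across feature channels rather than across pixels, so the attention map is C×C and the cost grows only linearly with image size. A hedged PyTorch sketch of such channel-wise ("transposed") attention follows (the module name, head count, and learnable temperature are illustrative, not Restormer's exact layer):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttentionSketch(nn.Module):
    """Attention across channels: the per-head attention map is (C/heads x C/heads),
    so memory does not grow with spatial resolution as pixel-wise attention does."""
    def __init__(self, dim=48, num_heads=4):
        super().__init__()
        self.num_heads = num_heads
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1)
        self.proj = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):                      # x: (B, C, H, W)
        B, C, H, W = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)
        shape = (B, self.num_heads, C // self.num_heads, H * W)
        q, k, v = (t.reshape(shape) for t in (q, k, v))
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature   # (B, heads, C/h, C/h)
        out = attn.softmax(dim=-1) @ v                         # (B, heads, C/h, H*W)
        return self.proj(out.reshape(B, C, H, W)) + x

x = torch.randn(1, 48, 128, 128)
print(ChannelAttentionSketch()(x).shape)       # torch.Size([1, 48, 128, 128])
```

Because the attention matrix depends only on the channel count, doubling the input resolution does not enlarge it, which is what keeps such a design usable on high-resolution images.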

SUNet: Swin Transformer UNet for Image Denoising

TLDR
A restoration model called SUNet is proposed, which uses the Swin Transformer layer as its basic block within a UNet architecture for image denoising.

Residual Swin Transformer Channel Attention Network for Image Demosaicing

TLDR
Inspired by the success of SwinIR, a novel Swin Transformer-based network for image demosaicing, called RSTCANet, is proposed; it outperforms state-of-the-art image demosaicing methods while using a smaller number of parameters.

HST: Hierarchical Swin Transformer for Compressed Image Super-resolution

TLDR
The Hierarchical Swin Transformer (HST) network is proposed to restore low-resolution compressed images; it jointly captures hierarchical feature representations and enhances the representation at each scale with a Swin Transformer.

DnSwin: Toward Real-World Denoising via Continuous Wavelet Sliding-Transformer

Low-Light Image Enhancement via Transformer-based Network

TLDR
The proposed residual learning method is compared with current leading low-light enhancement (LLE) methods on the LOL dataset, and the experiments show that it achieves state-of-the-art results on the PSNR, SSIM, and NIQE metrics.
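For context, PSNR, the first metric listed, is simply a log-scaled mean squared error between the restored image and the ground truth. A tiny NumPy sketch (assuming images scaled to [0, 1]):

```python
import numpy as np

def psnr(restored, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((restored.astype(np.float64) - reference.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

clean = np.random.rand(64, 64, 3)
noisy = np.clip(clean + 0.05 * np.random.randn(64, 64, 3), 0, 1)
print(f"PSNR of the noisy image: {psnr(noisy, clean):.2f} dB")
```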

ELMformer: Efficient Raw Image Restoration with a Locally Multiplicative Transformer

TLDR
An Efficient Locally Multiplicative Transformer, called ELMformer, contains two core designs tailored to raw images, whose primitive attribute is single-channel; it achieves the highest performance while keeping the lowest FLOPs on raw denoising and raw deblurring benchmarks compared with state-of-the-art methods.

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

TLDR
Extensive experiments on AWGN removal and real image denoising demonstrate that the new network architecture design achieves state-of-the-art performance and that the new degradation model helps to improve practicality.

Uformer: A General U-Shaped Transformer for Image Restoration

TLDR
Uformer is an effective and efficient Transformer-based architecture for image restoration that builds a hierarchical encoder-decoder network from Transformer blocks and adds a learnable multi-scale restoration modulator, in the form of a multi-scale spatial bias, to adjust features in multiple layers of the Uformer decoder.

DPFNet: A Dual-branch Dilated Network with Phase-aware Fourier Convolution for Low-light Image Enhancement

TLDR
This work proposes a novel module using the Fourier coefficients, which can recover high-quality texture details under the constraint of semantics in the frequency phase and supplement the spatial domain to alleviate the loss of detail caused by frequent downsampling.
...

References

Showing 1-10 of 97 references

CAS-CNN: A deep convolutional neural network for image compression artifact suppression

TLDR
This work presents a novel 12-layer deep convolutional network for image compression artifact suppression with hierarchical skip connections and a multi-scale loss function, and shows that a network trained for a specific quality factor (QF) is resilient to the QF used to compress the input image.

Uformer: A General U-Shaped Transformer for Image Restoration

TLDR
Uformer is an effective and efficient Transformer-based architecture for image restoration that builds a hierarchical encoder-decoder network from Transformer blocks and adds a learnable multi-scale restoration modulator, in the form of a multi-scale spatial bias, to adjust features in multiple layers of the Uformer decoder.

Feedback Network for Image Super-Resolution

TLDR
An image super-resolution feedback network (SRFBN) is proposed to refine low-level representations with high-level information, using hidden states in a recurrent neural network (RNN) with constraints to achieve such a feedback manner.

Multi-level Wavelet-CNN for Image Restoration

TLDR
This paper presents a novel multi-level wavelet CNN model for a better tradeoff between receptive field size and computational efficiency, and shows the effectiveness of MWCNN for image denoising, single image super-resolution, and JPEG image artifact removal.
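The receptive-field/efficiency tradeoff here comes from using a discrete wavelet transform in place of pooling: a single-level 2D Haar DWT halves the spatial size while keeping all information in four sub-bands, and the inverse transform recovers the input exactly. A small PyTorch sketch of that invertible sub-sampling step (a generic Haar DWT/IDWT pair, not the paper's full network):

```python
import torch

def haar_dwt(x):
    """Single-level 2D Haar transform: (B, C, H, W) -> (B, 4C, H/2, W/2).
    The four channel groups correspond to the LL, HL, LH and HH sub-bands."""
    a = x[:, :, 0::2, 0::2]   # top-left of each 2x2 block
    b = x[:, :, 0::2, 1::2]   # top-right
    c = x[:, :, 1::2, 0::2]   # bottom-left
    d = x[:, :, 1::2, 1::2]   # bottom-right
    ll = (a + b + c + d) / 2
    hl = (-a + b - c + d) / 2
    lh = (-a - b + c + d) / 2
    hh = (a - b - c + d) / 2
    return torch.cat([ll, hl, lh, hh], dim=1)

def haar_idwt(y):
    """Exact inverse of haar_dwt: (B, 4C, H/2, W/2) -> (B, C, H, W)."""
    ll, hl, lh, hh = y.chunk(4, dim=1)
    a = (ll - hl - lh + hh) / 2
    b = (ll + hl - lh - hh) / 2
    c = (ll - hl + lh - hh) / 2
    d = (ll + hl + lh + hh) / 2
    B, C, H2, W2 = a.shape
    x = torch.zeros(B, C, H2 * 2, W2 * 2, dtype=y.dtype)
    x[:, :, 0::2, 0::2], x[:, :, 0::2, 1::2] = a, b
    x[:, :, 1::2, 0::2], x[:, :, 1::2, 1::2] = c, d
    return x

img = torch.randn(1, 3, 64, 64)
print(torch.allclose(haar_idwt(haar_dwt(img)), img, atol=1e-6))   # True
```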

Pre-Trained Image Processing Transformer

TLDR
To maximally exploit the capability of the Transformer, the IPT model is presented, which utilizes the well-known ImageNet benchmark to generate a large number of corrupted image pairs, and contrastive learning is introduced to adapt well to different image processing tasks.

Plug-and-Play Image Restoration With Deep Denoiser Prior

TLDR
Experimental results demonstrate that the proposed plug-and-play image restoration with a deep denoiser prior not only significantly outperforms other state-of-the-art model-based methods but also achieves competitive or even superior performance against state-of-the-art learning-based methods.
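The plug-and-play idea summarized here alternates a data-fidelity update with a call to an off-the-shelf denoiser that stands in for the image prior. A schematic NumPy/SciPy sketch of such a loop for deblurring with a known kernel (the gradient-step data update and the median-filter "denoiser" are simple placeholders, not the paper's half-quadratic splitting solver or its trained CNN prior):

```python
import numpy as np
from scipy.ndimage import convolve, median_filter

def plug_and_play_restore(y, kernel, iters=30, step=1.0, denoise_size=3):
    """Schematic plug-and-play loop for deblurring y = kernel * x + noise:
    alternate a gradient step on the data term ||y - k*x||^2 with a denoiser
    that plays the role of the image prior."""
    k_flip = kernel[::-1, ::-1]            # flipped kernel, approximates the adjoint blur
    x = y.copy()
    for _ in range(iters):
        residual = convolve(x, kernel, mode="reflect") - y
        x = x - step * convolve(residual, k_flip, mode="reflect")   # data-fidelity step
        x = median_filter(x, size=denoise_size)                     # "denoiser prior" step
    return np.clip(x, 0.0, 1.0)

# toy example: blur a random image, add noise, then restore
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
kernel = np.ones((5, 5)) / 25.0
blurred = convolve(clean, kernel, mode="reflect") + 0.01 * rng.standard_normal((64, 64))
restored = plug_and_play_restore(blurred, kernel)
print(restored.shape)                      # (64, 64)
```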

Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation

TLDR
With direct down-sampling and up-sampling of the inputs and outputs by 4×, experiments demonstrate that the pure Transformer-based U-shaped encoder-decoder network outperforms methods based on full convolution or on a combination of Transformer and convolution.

Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution

TLDR
This paper proposes the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images; it generates multi-scale predictions in one feed-forward pass through progressive reconstruction, thereby facilitating resource-aware applications.
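The progressive reconstruction described here upsamples by 2× per pyramid level and adds a predicted sub-band residual at each level, so the 2× and 4× outputs come from one forward pass. A minimal PyTorch sketch of that coarse-to-fine structure (layer widths and depths are placeholders, not LapSRN's configuration):

```python
import torch
import torch.nn as nn

class LaplacianPyramidSR(nn.Module):
    """Sketch of progressive reconstruction: each level upsamples by 2x and
    adds a predicted sub-band residual, yielding 2x and 4x outputs in one pass."""
    def __init__(self, channels=32, levels=2):
        super().__init__()
        self.feat_in = nn.Conv2d(3, channels, 3, padding=1)
        self.feat_up = nn.ModuleList(
            [nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1), nn.LeakyReLU(0.2),
                nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1))
             for _ in range(levels)])
        self.to_residual = nn.ModuleList(
            [nn.Conv2d(channels, 3, 3, padding=1) for _ in range(levels)])
        self.img_up = nn.ModuleList(
            [nn.ConvTranspose2d(3, 3, 4, stride=2, padding=1) for _ in range(levels)])

    def forward(self, x):
        feats, img, outputs = self.feat_in(x), x, []
        for up, to_res, img_up in zip(self.feat_up, self.to_residual, self.img_up):
            feats = up(feats)                       # feature branch, 2x per level
            img = img_up(img) + to_res(feats)       # image branch + sub-band residual
            outputs.append(img)
        return outputs                              # [2x prediction, 4x prediction]

lr = torch.randn(1, 3, 32, 32)
outs = LaplacianPyramidSR()(lr)
print([tuple(o.shape) for o in outs])               # [(1, 3, 64, 64), (1, 3, 128, 128)]
```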

Image Super-Resolution Using Very Deep Residual Channel Attention Networks

TLDR
This work proposes a residual-in-residual (RIR) structure to form a very deep network, consisting of several residual groups with long skip connections, and a channel attention mechanism to adaptively rescale channel-wise features by considering interdependencies among channels.
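The channel attention mentioned here rescales each feature channel by a gate computed from its globally pooled statistics. A compact PyTorch sketch of one residual channel-attention block under that description (the reduction ratio of 16 follows the common squeeze-and-excitation convention and is an assumption here):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Global average pooling -> bottleneck convs -> sigmoid gate per channel."""
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.gate(x)          # rescale channels by learned importance

class ResidualChannelAttentionBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            ChannelAttention(channels))

    def forward(self, x):
        return x + self.body(x)          # short skip; long skips wrap groups of these

x = torch.randn(1, 64, 48, 48)
print(ResidualChannelAttentionBlock()(x).shape)   # torch.Size([1, 64, 48, 48])
```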

Residual Dense Network for Image Restoration

TLDR
This work proposes a residual dense block (RDB) to extract abundant local features via densely connected convolutional layers, and local feature fusion in the RDB to adaptively learn more effective features from preceding and current local features and to stabilize the training of the wider network.
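The residual dense block described above concatenates the outputs of all preceding convolutional layers, fuses them with a 1×1 convolution (local feature fusion), and adds the block input back (local residual learning). A short PyTorch sketch of that block (the growth rate and layer count are illustrative):

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Densely connected convs + 1x1 local feature fusion + local residual."""
    def __init__(self, channels=64, growth=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                nn.ReLU(inplace=True)))
        self.fuse = nn.Conv2d(channels + num_layers * growth, channels, 1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))   # dense connections
        return x + self.fuse(torch.cat(feats, dim=1))      # local fusion + residual

x = torch.randn(1, 64, 48, 48)
print(ResidualDenseBlock()(x).shape)   # torch.Size([1, 64, 48, 48])
```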
...