Perceptual Learned Source-Channel Coding for High-Fidelity Image Semantic Transmission

Jun Wang, Sixian Wang, Jincheng Dai, Zhongwei Si, Dekun Zhou, Kai Niu
As a novel approach to end-to-end wireless image semantic transmission, deep learning-based joint source-channel coding (deep JSCC) is emerging in both the deep learning and communication communities. However, current deep JSCC image transmission systems are typically optimized for traditional distortion metrics such as peak signal-to-noise ratio (PSNR) or multi-scale structural similarity (MS-SSIM). But for low transmission rates, due to the imperfect wireless channel, these…
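For readers unfamiliar with the distortion metrics named in the abstract, the following is a minimal sketch of the standard PSNR definition, PSNR = 10·log10(MAX² / MSE). It is not taken from the paper: the function name `psnr`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions chosen for illustration.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Illustrative helper (hypothetical name); max_value=255 assumes 8-bit pixels.
    """
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((r - d) ** 2 for r, d in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        # Identical inputs: distortion is zero, so PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-pixel example; the values are made up for illustration only.
original = [52, 55, 61, 59]
decoded = [54, 55, 60, 57]
print(round(psnr(original, decoded), 2))
```

Because PSNR depends only on the pixel-wise mean squared error, two reconstructions with the same error energy receive the same score regardless of how the error looks to a viewer, which is the kind of limitation that motivates perceptual metrics.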

The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
A new dataset of human perceptual similarity judgments is introduced. Deep features outperform all previous metrics by large margins on this dataset, which suggests that perceptual similarity is an emergent property shared across deep visual representations.
BPG image format
DeepJSCC-f: Deep Joint Source-Channel Coding of Images With Feedback
To the best of the authors' knowledge, this is the first practical JSCC scheme that can fully exploit channel output feedback, demonstrating yet another setting in which modern machine learning techniques can enable the design of new and efficient communication methods that surpass the performance of traditional structured coding-based designs.
Multiscale structural similarity for image quality assessment
This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions, and develops an image synthesis method to calibrate the parameters that define the relative importance of different scales.
Conditional Generative Adversarial Nets
The conditional version of generative adversarial nets is introduced, which can be constructed by simply feeding the conditioning data, y, to both the generator and the discriminator, and it is shown that this model can generate MNIST digits conditioned on class labels.
Nonlinear Transform Source-Channel Coding for Semantic Communications
The proposed NTSCC method can potentially support future semantic communications due to its content-aware ability and perceptual optimization goal and generally outperforms both the analog transmission using the standard deep joint source-channel coding and the classical separation-based digital transmission.
High-Fidelity Generative Image Compression
This work extensively studies how to combine Generative Adversarial Networks and learned compression to obtain a state-of-the-art generative lossy compression system, and it bridges the gap between rate-distortion-perception theory and practice.
Image Quality Assessment: Unifying Structure and Texture Similarity
This work develops the first full-reference image quality model with explicit tolerance to texture resampling, using a convolutional neural network to construct an injective and differentiable function that transforms images to multi-scale overcomplete representations.
The Open Images Dataset V4
In-depth, comprehensive statistics about the dataset are provided, the quality of the annotations is validated, the way the performance of several modern models evolves with increasing amounts of training data is studied, and two applications made possible by having unified annotations of multiple types coexisting in the same images are demonstrated.
Generative Adversarial Networks for Extreme Learned Image Compression
If a semantic label map of the original image is available, the learned image compression system can fully synthesize unimportant regions in the decoded image such as streets and trees from the label map, proportionally reducing the storage cost.