On the Impact of Lossy Image and Video Compression on the Performance of Deep Convolutional Neural Network Architectures

@article{Poyser2021OnTI,
  title={On the Impact of Lossy Image and Video Compression on the Performance of Deep Convolutional Neural Network Architectures},
  author={Matt Poyser and Amir Atapour-Abarghouei and T. Breckon},
  journal={2020 25th International Conference on Pattern Recognition (ICPR)},
  year={2021},
  pages={2830-2837}
}
Recent advances in generalized image understanding have seen a surge in the use of deep convolutional neural networks (CNN) across a broad range of image-based detection, classification and prediction tasks. Whilst the reported performance of these approaches is impressive, this study investigates the hitherto unapproached question of the impact of commonplace image and video compression techniques on the performance of such deep learning architectures. Focusing on the JPEG and H.264 (MPEG-4… 

First Gradually, Then Suddenly: Understanding the Impact of Image Compression on Object Detection Using Deep Learning
TLDR
This paper investigates the impact of image compression on the performance of object detection methods based on convolutional neural networks, focusing on Joint Photographic Experts Group (JPEG) compression and thoroughly analyzing a range of performance metrics.
Boosting Neural Image Compression for Machines Using Latent Space Masking
TLDR
This work proposes LSMnet, a network that runs in parallel to the encoder network and masks out elements of the latent space that are presumably not required for the analysis network, together with a feature-based loss that allows training without annotated data.
Operationalizing Convolutional Neural Network Architectures for Prohibited Object Detection in X-Ray Imagery
TLDR
This work explores the viability of two recent end-to-end object detection CNN architectures, Cascade R-CNN and FreeAnchor, for prohibited item detection by balancing processing time and the impact of image data compression from an operational viewpoint.
TACTIC: Joint Rate-Distortion-Accuracy Optimisation for Low Bitrate Compression
TLDR
This work presents TACTIC: Task-Aware Compression Through Intelligent Coding, a lossy compression model that learns from the rate-distortion-accuracy trade-off for a specific task and is able to improve the accuracy of ImageNet subset classification.
Impact of Image Compression on the Performance of Steel Surface Defect Classification with a CNN
TLDR
This paper studies the impact of quality degradation resulting from image compression on the classification performance of steel surface defects with a CNN, and finds that compression-based data augmentation significantly increased the classification precision to perfect scores, and thus improved the generalization of models when tested on different compression qualities.
Closed-Loop Region of Interest Enabling High Spatial and Temporal Resolutions in Object Detection and Tracking via Wireless Camera
TLDR
This paper systematically characterizes the effects of ROI on camera capturing, data transmission, and image processing, and presents a closed-loop ROI algorithm capable of high spatial and temporal resolution as well as a wide scanning field of view (FOV) in single- and multi-object detection and tracking via real-time wireless video streaming.
Lost in Compression: the Impact of Lossy Image Compression on Variable Size Object Detection within Infrared Imagery
Lossy image compression strategies allow for more efficient storage and transmission of data by encoding data to a reduced form. This is essential to enable training with larger datasets on less
Exponentiated Gradient Reweighting for Robust Training Under Label Noise and Beyond
TLDR
Inspired by the expert setting in on-line learning, this work presents a flexible approach to learning from noisy examples that treats each training example as an expert and maintains a distribution over all examples.

References

SHOWING 1-10 OF 29 REFERENCES
Two-Stream Convolutional Networks for Action Recognition in Videos
TLDR
This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Understanding how image quality affects deep neural networks
  • Samuel F. Dodge, Lina Karam
  • Computer Science
    2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX)
  • 2016
TLDR
An evaluation of 4 state-of-the-art deep neural network models for image classification under quality distortions shows that the existing networks are susceptible to these quality distortions, particularly to blur and noise.
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
TLDR
Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TLDR
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Impact of JPEG 2000 compression on deep convolutional neural networks for metastatic cancer detection in histopathological images
TLDR
The impact of JPEG 2000 compression on the proposed CNN-based algorithm, which has produced performance comparable to that of pathologists and which was ranked second place in the CAMELYON17 challenge, is studied.
ImageNet classification with deep convolutional neural networks
TLDR
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Face detection directly from h.264 compressed video with convolutional neural network
  • ShinShan Zhuang, S. Lai
  • Computer Science
    2009 16th IEEE International Conference on Image Processing (ICIP)
  • 2009
TLDR
A novel face detection algorithm based on a convolutional neural network architecture that can rapidly detect human face regions in video sequences encoded by H.264/AVC is proposed.
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition
Compressed domain human action recognition in H.264/AVC video streams
This paper discusses a novel high-speed approach for human action recognition in H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors