Corpus ID: 231846909

Efficient Certified Defenses Against Patch Attacks on Image Classifiers

  • Jan Hendrik Metzen, Maksym Yatsura
Adversarial patches pose a realistic threat model for physical world attacks on autonomous systems via their perception component. Autonomous systems in safety-critical domains such as automated driving should thus contain a fail-safe fallback component that combines certifiable robustness against patches with efficient inference while maintaining high performance on clean inputs. We propose BAGCERT, a novel combination of model architecture and certification procedure that allows efficient… 


Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation

Demasked Smoothing is presented, the first approach to certify the robustness of semantic segmentation models against this threat model; on the ADE20K dataset it can on average certify 64% of the pixel predictions for a 1% patch in the detection task and 48% against a 0.5% patch for the recovery task.

Zero-Shot Certified Defense against Adversarial Patches with Vision Transformers

PatchVeto is proposed, a zero-shot certified defense against adversarial patches based on Vision Transformer (ViT) models that can achieve high accuracy on clean inputs while detecting adversarial patched inputs by simply manipulating the attention map of ViT.

Towards Practical Certifiable Patch Defense with Vision Transformer

A progressive smoothed image modeling task is proposed to train Vision Transformer, capturing the more discriminable local context of an image while preserving its global semantic information, and it obtains state-of-the-art clean and certified accuracy with efficient inference on CIFAR-10 and ImageNet.

Adversarial Patch Attacks and Defences in Vision-Based Tasks: A Survey

An overview of existing techniques of adversarial patch attacks is provided to help interested researchers quickly catch up with the progress, and existing techniques for developing detection and defences against adversarial patches are discussed to help the community better understand this type of attack and its applications in the real world.

PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

It is proved that PatchCleanser will always predict the correct class labels on certain images against any adaptive white-box attacker within the authors' threat model, achieving certified robustness.
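The two-round masking idea behind PatchCleanser can be sketched as follows. This is a minimal illustration, not the authors' implementation: `classify`, the mask size, and the stride are placeholder assumptions, and the certification analysis is omitted.

```python
import numpy as np

def masked_views(image, mask_size, stride):
    """Yield copies of the image with one square region zeroed out."""
    h, w = image.shape[:2]
    for y in range(0, h - mask_size + 1, stride):
        for x in range(0, w - mask_size + 1, stride):
            view = image.copy()
            view[y:y + mask_size, x:x + mask_size] = 0.0
            yield view

def double_masking_predict(image, classify, mask_size=2, stride=1):
    """Round one: classify every one-mask view; unanimous agreement wins.
    Round two: re-check each disagreeing view under a second mask."""
    first = [classify(v) for v in masked_views(image, mask_size, stride)]
    majority = max(set(first), key=first.count)
    if all(p == majority for p in first):
        return majority
    for view, pred in zip(masked_views(image, mask_size, stride), first):
        if pred != majority:
            second = [classify(v) for v in masked_views(view, mask_size, stride)]
            if all(p == pred for p in second):
                return pred
    return majority
```

The intuition: if the patch lies entirely inside some mask, the view with that mask is clean, so unanimous agreement across all one-mask views certifies the prediction.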

PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking

This paper proposes a general defense framework called PatchGuard that can achieve high provable robustness while maintaining high clean accuracy against localized adversarial patches, and presents the robust masking defense that robustly detects and masks corrupted features to recover the correct prediction.
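The robust-masking step described above can be sketched on a class-evidence feature map. A hedged toy version, not PatchGuard's actual code: the feature-map shape, the window size, and sum-pooling are illustrative assumptions.

```python
import numpy as np

def robust_masking_logits(features, window):
    """features: (H, W, C) class-evidence map from a small-receptive-field
    network. Zero out the window with the highest total activation (the
    likely patch footprint), then sum-pool the rest into per-class logits."""
    h, w, _ = features.shape
    best, best_pos = -np.inf, (0, 0)
    for y in range(h - window + 1):
        for x in range(w - window + 1):
            s = features[y:y + window, x:x + window].sum()
            if s > best:
                best, best_pos = s, (y, x)
    masked = features.copy()
    y, x = best_pos
    masked[y:y + window, x:x + window] = 0.0
    return masked.sum(axis=(0, 1))
```

Because a small receptive field confines the patch's influence to a bounded feature window, masking the most suspicious window can remove the corrupted evidence while leaving most clean evidence intact.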

Certified defense against patch attacks via mask-guided randomized smoothing

By adding Gaussian noise only on the patch region with a mask, it is proved that randomized smoothing can achieve a stronger certificate with high confidence, and a practical scheme based on joint voting is designed to certify the patch region effectively and with high probability.

PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches

This paper extends PatchGuard to PatchGuard++ for provably detecting the adversarial patch attack, boosting both provable robust accuracy and clean accuracy, and demonstrates that PatchGuard++ significantly improves provable robustness and clean performance.

ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers

This paper provides the very first study on developing certified detection against the dual patch attack, in which the attacker is allowed to adversarially manipulate pixels in two different regions.

Certified Defenses for Adversarial Patches

An extension of certified defense algorithms is presented and significantly faster variants for robust training against patch attacks are proposed, with the observation that robustness to such attacks transfers surprisingly well.

Defending Against Physically Realizable Attacks on Image Classification

A new abstract adversarial model is proposed, rectangular occlusion attacks, in which an adversary places a small adversarially crafted rectangle in an image, and two approaches for efficiently computing the resulting adversarial examples are developed.

On Physical Adversarial Patches for Object Detection

This paper demonstrates a physical adversarial patch attack against object detectors, notably the YOLOv3 detector, and shows that a properly designed patch can suppress virtually all the detected objects in the image.

DPATCH: An Adversarial Patch Attack on Object Detectors

Extensive evaluations imply that DPatch can perform effective attacks under black-box setup, i.e., even without the knowledge of the attacked network's architectures and parameters, making it very practical to implement real-world attacks.

Certified Robustness to Adversarial Examples with Differential Privacy

This paper presents the first certified defense that both scales to large networks and datasets and applies broadly to arbitrary model types, based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired privacy formalism.

Safety Verification of Deep Neural Networks

A novel automated verification framework for feed-forward multi-layer neural networks based on Satisfiability Modulo Theory (SMT) is developed, which defines safety for an individual decision in terms of invariance of the classification within a small neighbourhood of the original image.

Certified Adversarial Robustness with Additive Noise

This work establishes a connection between robustness against adversarial perturbation and additive random noise, and proposes a training strategy that can significantly improve the certified bounds.
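The noise-based certification idea shared by this line of work can be sketched as a Monte-Carlo smoothed classifier. A minimal illustration under stated assumptions: `classify`, `sigma`, and the sample count `n` are placeholders, and the statistical bound that turns vote counts into a certified radius is omitted.

```python
import numpy as np

def smoothed_predict(x, classify, sigma=0.25, n=100, seed=0):
    """Classify n Gaussian-noised copies of x and return the majority
    label together with its empirical vote share."""
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n):
        label = classify(x + rng.normal(0.0, sigma, size=x.shape))
        votes[label] = votes.get(label, 0) + 1
    top = max(votes, key=votes.get)
    return top, votes[top] / n
```

A high vote share for the top label is what the certification step converts, via a concentration bound, into a guaranteed robust radius around `x`.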

Local Gradients Smoothing: Defense Against Localized Adversarial Attacks

An effective method is developed to estimate the noise location in the gradient domain and smooth the high-activation regions caused by adversarial noise in the image domain, while having minimal effect on the salient object that is important for correct classification.

On Visible Adversarial Perturbations & Digital Watermarking

  • Jamie Hayes
  • Computer Science
    2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2018
Under this threat model, adversarial perturbations transform images so that the model's output is an attacker-chosen class; attacks that can bypass these defenses are also discussed.

Fooling Automated Surveillance Cameras: Adversarial Patches to Attack Person Detection

The goal is to generate a patch that is able to successfully hide a person from a person detector, and this work is the first to attempt this kind of attack on targets with a high level of intra-class variety like persons.