Corpus ID: 239998354

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

Authors: Husheng Han, Kaidi Xu, Xing Hu, Xiaobing Chen, Ling Liang, Zidong Du, Qi Guo, Yanzhi Wang, Yunji Chen
Adversarial patch attacks, which craft the pixels in a confined region of the input image, demonstrate powerful attack effectiveness in physical environments even under noise or deformation. Existing certified defenses against adversarial patch attacks work well on small images such as MNIST and CIFAR-10, but achieve very poor certified accuracy on higher-resolution images such as ImageNet. It is urgent to design both robust and effective defenses against such a practical and harmful attack in…


Zero-Shot Certified Defense against Adversarial Patches with Vision Transformers

PatchVeto is proposed, a zero-shot certified defense against adversarial patches based on Vision Transformer (ViT) models that can achieve high accuracy on clean inputs while detecting adversarial patched inputs by simply manipulating the attention map of ViT.

PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

It is proved that PatchCleanser will always predict the correct class labels on certain images against any adaptive white-box attacker within the authors' threat model, achieving certified robustness.

Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation

Demasked Smoothing is presented, the first approach to certify the robustness of semantic segmentation models against this threat model; on the ADE20K dataset it can on average certify 64% of the pixel predictions against a 1% patch in the detection task and 48% against a 0.5% patch in the recovery task.

SoK: Certified Robustness for Deep Neural Networks

This paper provides a taxonomy for the robustness verification and training approaches, and provides an open-sourced unified platform to evaluate 20+ representative verification and corresponding robust training approaches on a wide range of DNNs.

ObjectSeeker: Certifiably Robust Object Detection against Patch Hiding Attacks via Patch-agnostic Masking

A certification procedure is developed to determine if ObjectSeeker can detect certain objects with a provable guarantee against any adaptive attacker within the threat model, and demonstrates a significant improvement in certified robustness over the prior work.

Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness

This work proposes a novel solution to strategically manipulate neurons, by “grafting” appropriate levels of linearity, to first linearize insignificant ReLU neurons, and optimize the associated slopes and intercepts of the replaced linear activations for restoring model performance while maintaining certifiability.

Can pruning improve certified robustness of neural networks?

It is demonstrated for the first time that pruning can generally improve certified robustness for ReLU-based NNs under the complete verification setting, and a new stability-based pruning method tailored for reducing neuron instability is presented that outperforms existing pruning methods in enhancing certified robustness.

ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers

This paper provides the very first study on developing certified detection against the dual patch attack, in which the attacker is allowed to adversarially manipulate pixels in two different regions.



PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking

This paper proposes a general defense framework called PatchGuard that can achieve high provable robustness while maintaining high clean accuracy against localized adversarial patches, and presents the robust masking defense that robustly detects and masks corrupted features to recover the correct prediction.

Local Gradients Smoothing: Defense Against Localized Adversarial Attacks

This work develops an effective method to estimate the noise location in the gradient domain and to transform the high-activation regions caused by adversarial noise in the image domain, while having minimal effect on the salient object that is important for correct classification.

(De)Randomized Smoothing for Certifiable Defense against Patch Attacks

A certifiable defense against patch attacks that guarantees for a given image and patch attack size, no patch adversarial examples exist, and is related to the broad class of randomized smoothing robustness schemes which provide high-confidence probabilistic robustness certificates.
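The column-smoothing variant of this scheme admits a compact certification check. The sketch below is a hypothetical reconstruction, not the paper's code (the function and variable names are my own): each vertical image band is classified independently, and the majority-vote prediction is certified when the top class's vote margin exceeds twice the number of bands that a patch of the given width could corrupt.

```python
import math
import numpy as np

def certify_column_smoothing(band_preds, patch_width, band_width, num_classes):
    """Majority-vote certification over per-band predictions.

    band_preds[i] is the class predicted from image band i alone.
    Returns (predicted_class, certified): `certified` means no patch of
    width `patch_width` can flip the majority vote, since each corrupted
    band can move at most one vote from the top class to a rival class.
    """
    counts = np.bincount(np.asarray(band_preds), minlength=num_classes)
    order = np.argsort(counts)[::-1]          # classes by descending votes
    top, runner_up = int(order[0]), int(order[1])
    # A width-`patch_width` patch overlaps at most `delta` bands.
    delta = math.ceil((patch_width + band_width - 1) / band_width)
    certified = counts[top] - counts[runner_up] > 2 * delta
    return top, bool(certified)
```

For example, with 10 bands voting 8-to-2 for class 0, a 4-pixel-wide patch over 4-pixel bands corrupts at most 2 bands, so the margin of 6 exceeds 2·2 and the prediction is certified; a 6-to-4 vote with the same patch would not be.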

PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches

This paper extends PatchGuard to PatchGuard++ for provably detecting adversarial patch attacks, boosting both provable robust accuracy and clean accuracy; experiments demonstrate that PatchGuard++ significantly improves provable robustness and clean performance.

LanCe: A Comprehensive and Lightweight CNN Defense Methodology against Physical Adversarial Attacks on Embedded Multimedia Applications

This paper proposes LanCe – a comprehensive and lightweight CNN defense methodology against different physical adversarial attacks, finding that non-semantic adversarial perturbations can activate CNN with significantly abnormal activations and even overwhelm other semantic input patterns’ activations.

Detecting Localized Adversarial Examples: A Generic Approach using Critical Region Analysis

A generic defense system called TaintRadar is proposed to accurately detect localized adversarial examples via analyzing critical regions that have been manipulated by attackers, and can effectively capture sophisticated localized partial attacks, e.g., the eye-glasses attack.

Structured Adversarial Attack: Towards General Implementation and Better Interpretability

This work develops a more general attack model, i.e., the structured attack (StrAttack), which explores group sparsity in adversarial perturbations by sliding a mask through images aiming for extracting key spatial structures through adversarial saliency map and class activation map.

DoPa: A Comprehensive CNN Detection Methodology against Physical Adversarial Attacks

It is found that non-semantic adversarial perturbations can activate CNN with significantly abnormal activations and even overwhelm other semantic input patterns' activations, so a self-verification stage is added to analyze the semantics of distinguished activation patterns, which improves the CNN recognition process.

Scalable Verified Training for Provably Robust Image Classification

This work shows how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state of the art in verified accuracy, and allows the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.
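As a rough illustration of interval bound propagation (not the paper's training code; the toy weights below are made up), bounds on an ε-ball around the input can be pushed through affine and ReLU layers, yielding an interval guaranteed to contain every reachable output:

```python
import numpy as np

def ibp_linear(l, u, W, b):
    # Affine layer y = W x + b: split W into positive and negative parts
    # so each output bound picks the worst-case input within [l, u].
    W_pos = np.maximum(W, 0.0)
    W_neg = np.minimum(W, 0.0)
    return W_pos @ l + W_neg @ u + b, W_pos @ u + W_neg @ l + b

def ibp_relu(l, u):
    # ReLU is monotone, so the bounds pass through elementwise.
    return np.maximum(l, 0.0), np.maximum(u, 0.0)

# Toy 2-layer net: propagate bounds for an eps-ball around x.
x = np.array([0.5, -0.2])
eps = 0.1
l, u = x - eps, x + eps
W1 = np.array([[1.0, -1.0], [0.5, 2.0]]); b1 = np.zeros(2)
W2 = np.array([[1.0, 1.0]]);              b2 = np.zeros(1)
l, u = ibp_relu(*ibp_linear(l, u, W1, b1))
l, u = ibp_linear(l, u, W2, b2)
# Every input within the eps-ball produces an output inside [l, u].
```

Training with IBP then minimizes a loss on these worst-case bounds rather than on the nominal output, which is what makes the resulting network certifiable by the same cheap bound computation.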

Adversarial T-Shirt! Evading Person Detectors in a Physical World

This is the first work that models the effect of deformation when designing physical adversarial examples for non-rigid objects such as T-shirts, and it shows that the proposed method achieves 74% and 57% attack success rates in the digital and physical worlds, respectively, against YOLOv2 and Faster R-CNN.