Adversarial Examples Detection Beyond Image Space

  title={Adversarial Examples Detection Beyond Image Space},
  author={Kejiang Chen and Yuefeng Chen and Hang Zhou and Chuan Qin and Xiaofeng Mao and Weiming Zhang and Nenghai Yu},
  journal={ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  • Published 23 February 2021
  • Computer Science
Deep neural networks have been proven vulnerable to adversarial examples, which are generated by adding human-imperceptible perturbations to images. To defend against these adversarial examples, various detection-based methods have been proposed. However, most of them perform poorly at detecting adversarial examples with extremely slight perturbations. By exploring these adversarial examples, we find that there exists compliance between perturbations and prediction confidence, which…


NoiLIn: Improving Adversarial Training and Correcting Stereotype of Noisy Labels

A simple but effective method, NoiLIn, randomly injects noisy labels (NLs) into the training data at each training epoch and dynamically increases the NL injection rate once robust overfitting occurs; this mitigates adversarial training's (AT's) undesirable issue of robust overfitting and improves the generalization of state-of-the-art AT methods.
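The core mechanic described above — flipping a random fraction of labels each epoch and raising that fraction when robust overfitting appears — can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the function names, the injection step size, and the use of a robust-accuracy drop as the overfitting signal are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def inject_noisy_labels(labels, num_classes, rate):
    """Randomly replace a fraction `rate` of labels with a different class."""
    labels = labels.copy()
    n = len(labels)
    idx = rng.choice(n, size=int(rate * n), replace=False)
    for i in idx:
        # Pick any class other than the current label.
        choices = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(choices)
    return labels

def update_rate(rate, robust_acc_best, robust_acc_now, step=0.05, cap=0.5):
    """Raise the injection rate once robust overfitting is detected,
    proxied here (an assumption) by robust accuracy dropping below its best."""
    if robust_acc_now < robust_acc_best:
        rate = min(rate + step, cap)
    return rate
```

In a training loop, `inject_noisy_labels` would be called once per epoch on the clean label set, and `update_rate` after each robust-accuracy evaluation.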

Randomized Smoothing Under Attack: How Good is it in Practice?

The main observation is that there is a major mismatch between the settings in which randomized smoothing (RS) obtains high certified robustness and those in which it defeats black-box attacks while preserving classifier accuracy.

Detecting and Recovering Adversarial Examples from Extracting Non-robust and Highly Predictive Adversarial Perturbations

A model-free adversarial example (AE) detection method based on high-dimensional perturbation extraction, which can not only detect adversarial examples with high accuracy but also identify the specific category of the AEs.

Detecting Adversarial Image Examples in Deep Neural Networks with Adaptive Noise Reduction

This paper proposes a straightforward method for detecting adversarial image examples, which can be directly deployed into unmodified off-the-shelf DNN models and raises the bar for defense-aware attacks.

Towards Robust Detection of Adversarial Examples

This paper presents a novel training procedure and a thresholding test strategy towards robust detection of adversarial examples, proposing to minimize the reverse cross-entropy (RCE), which encourages a deep network to learn latent representations that better distinguish adversarial examples from normal ones.
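A minimal sketch of a reverse cross-entropy loss, assuming the "reversed" label places uniform mass 1/(K-1) on every class except the true one (the exact form used in the paper may differ):

```python
import numpy as np

def reverse_cross_entropy(probs, y, eps=1e-12):
    """Cross-entropy between predicted probabilities and the reversed
    label: uniform 1/(K-1) on all classes except the true class y."""
    K = probs.shape[-1]
    rev = np.full(K, 1.0 / (K - 1))
    rev[y] = 0.0  # zero mass on the true class
    return -np.sum(rev * np.log(probs + eps))
```

Under this assumed form, a prediction that spreads mass uniformly over the non-true classes attains a lower RCE than one concentrated on the true class, which is the behavior the training objective rewards.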

Detection Based Defense Against Adversarial Examples From the Steganalysis Point of View

Steganalysis can be applied to adversarial example detection, and a method is proposed to enhance steganalysis features by estimating the probability of modifications caused by adversarial attacks.

Detecting Adversarial Samples from Artifacts

This paper investigates model confidence on adversarial samples by looking at Bayesian uncertainty estimates, available in dropout neural networks, and by performing density estimation in the subspace of deep features learned by the model. The results show a method for implicit adversarial detection that is oblivious to the attack algorithm.
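The dropout-uncertainty idea can be sketched with Monte-Carlo dropout: keep dropout active at inference, run several stochastic forward passes, and treat high predictive variance as a signal of a possible adversarial input. Everything below is a toy illustration — the network, weights, and thresholding scheme are assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network with random weights (purely illustrative).
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 3))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, drop_p=0.5):
    """One stochastic forward pass with dropout kept ON at inference."""
    h = np.maximum(x @ W1, 0.0)             # ReLU hidden layer
    mask = rng.random(h.shape) > drop_p     # random dropout mask
    h = h * mask / (1.0 - drop_p)           # inverted-dropout scaling
    return softmax(h @ W2)

def mc_dropout_uncertainty(x, T=100):
    """Mean prediction and per-class variance over T stochastic passes."""
    preds = np.stack([forward(x) for _ in range(T)])
    return preds.mean(axis=0), preds.var(axis=0)

x = rng.normal(size=(4,))
mean, var = mc_dropout_uncertainty(x)
# Inputs whose total predictive variance exceeds a threshold tuned on
# clean data would be flagged as potentially adversarial.
uncertainty_score = var.sum()
```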

Defending Against Universal Perturbations With Shared Adversarial Training

This work shows that adversarial training is more effective at preventing universal perturbations, where the same perturbation needs to fool a classifier on many inputs, and investigates the trade-off between robustness against universally perturbed data and performance on unperturbed data.

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

It is concluded that adversarial examples are significantly harder to detect than previously appreciated, and that the properties believed to be intrinsic to adversarial examples are in fact not.

Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks

Two feature squeezing methods are explored: reducing the color bit depth of each pixel and spatial smoothing, which are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.
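The two squeezers named above are simple to state: quantize pixels to fewer bits, and replace each pixel with the median of its neighborhood; an input is flagged when the model's predictions on the squeezed copies diverge from its prediction on the original. The sketch below is a minimal reconstruction; the window size, bit depth, threshold, and distance metric are assumptions rather than the paper's exact choices.

```python
import numpy as np

def reduce_bit_depth(img, bits=4):
    """Squeeze color bit depth: quantize [0, 1] pixels to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(img * levels) / levels

def median_smooth(img, k=3):
    """Simple 2-D median filter (spatial smoothing squeezer), edge-padded."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

def squeeze_detect(model, img, threshold=0.1):
    """Flag the input if predictions on squeezed copies diverge (L1) from
    the prediction on the original image by more than `threshold`."""
    p0 = model(img)
    scores = [np.abs(p0 - model(sq(img))).sum()
              for sq in (reduce_bit_depth, median_smooth)]
    return max(scores) > threshold
```

Here `model` is any callable mapping an image to a probability vector; the joint detection rule takes the maximum divergence across squeezers, so the two defenses compose without modifying the classifier.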

EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

The authors' elastic-net attacks to DNNs (EAD) feature L1-oriented adversarial examples and include the state-of-the-art L2 attack as a special case, suggesting novel insights on leveraging L1 distortion in adversarial machine learning and the security implications of DNNs.

Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses

An efficient approach is proposed to generate gradient-based attacks that induce misclassifications with low L2 norm, by decoupling the direction and the norm of the adversarial perturbation that is added to the image.
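The decoupling idea can be sketched in two pieces: a step that moves the perturbation along the gradient direction and then projects it back onto a sphere of fixed L2 norm, plus a rule that shrinks or grows that norm depending on whether the current perturbation already fools the model. The function names, step sizes, and multiplicative norm schedule below are illustrative assumptions, not the published algorithm verbatim.

```python
import numpy as np

def decoupled_step(delta, grad, eps, step=0.5):
    """Update the perturbation direction along the (normalized) gradient,
    then rescale it back onto the sphere of L2 norm eps."""
    g = grad / (np.linalg.norm(grad) + 1e-12)              # unit direction
    delta = delta + step * g                               # direction update
    delta = delta * (eps / (np.linalg.norm(delta) + 1e-12))  # norm projection
    return delta

def adjust_norm(eps, is_adversarial, gamma=0.05):
    """Shrink the norm when the input is already adversarial (seeking a
    smaller perturbation), grow it otherwise."""
    return eps * (1 - gamma) if is_adversarial else eps * (1 + gamma)
```

Iterating these two updates searches for the smallest-norm perturbation that still induces a misclassification, which is the efficiency gain the decoupling buys.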

Procedural Noise Adversarial Examples for Black-Box Attacks on Deep Convolutional Networks

This paper introduces a structured approach for generating Universal Adversarial Perturbations (UAPs) with procedural noise functions, and unveils the systemic vulnerability of popular deep convolutional network (DCN) models like Inception v3 and YOLO v3, with single noise patterns able to fool a model on up to 90% of the dataset.