• Corpus ID: 51757341

Decision Boundary Analysis of Adversarial Examples

@inproceedings{He2018DecisionBA,
  title={Decision Boundary Analysis of Adversarial Examples},
  author={Warren He and Bo Li and Dawn Xiaodong Song},
  booktitle={ICLR},
  year={2018}
}
Deep neural networks (DNNs) are vulnerable to adversarial examples, which are carefully crafted instances aiming to cause prediction errors for DNNs. Recent research on adversarial examples has examined local neighborhoods in the input space of DNN models. However, previous work has limited what regions to consider, focusing either on low-dimensional subspaces or small balls. In this paper, we argue that information from larger neighborhoods, such as from more directions and from greater… 
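The abstract argues for probing much larger neighborhoods of an input, along many directions and out to greater distances than prior work. A minimal sketch of that kind of probe, assuming a hypothetical PyTorch classifier `model` and a single input tensor `x` (not the paper's exact procedure), estimates the distance to the nearest prediction change along each of several random unit directions via a coarse line search:

```python
import torch

def boundary_distance_along_direction(model, x, direction, max_dist=10.0, steps=200):
    """Walk from x along a unit-norm direction until the predicted label changes.

    Returns the first distance at which the prediction flips, or None if the
    label never changes within max_dist. Coarse line search, for illustration only.
    """
    model.eval()
    with torch.no_grad():
        base_label = model(x.unsqueeze(0)).argmax(dim=1).item()
        for i in range(1, steps + 1):
            dist = max_dist * i / steps
            probe = x + dist * direction
            if model(probe.unsqueeze(0)).argmax(dim=1).item() != base_label:
                return dist
    return None

def probe_many_directions(model, x, num_directions=100):
    """Sample random unit directions and record the boundary distance along each."""
    dists = []
    for _ in range(num_directions):
        d = torch.randn_like(x)
        d = d / d.norm()
        dists.append(boundary_distance_along_direction(model, x, d))
    return dists
```

Aggregating these per-direction distances gives a crude picture of how close, and in how many directions, the decision boundary sits around `x`.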
Towards Understanding and Improving the Transferability of Adversarial Examples in Deep Neural Networks
TLDR
This work empirically investigates two classes of factors that might influence the transferability of adversarial examples, including model-specific factors such as network architecture, model capacity, and test accuracy, and proposes a simple but effective strategy to improve transferability.
Adversarial Adaptive Neighborhood With Feature Importance-Aware Convex Interpolation
TLDR
A new method is introduced, which uses correct predicted samples in disjoint classes to guide the generation of more explainable adversarial samples in the ambiguous region around the decision boundary instead of uncontrolled “blind spots”, via convex combination in a feature component-wise manner which takes the individual importance of feature ingredients into account.
Understanding and Enhancing the Transferability of Adversarial Examples
TLDR
This work systematically studies two classes of factors that might influence the transferability of adversarial examples: model-specific factors, such as network architecture, model capacity, and test accuracy, and the local smoothness of the loss function used for constructing adversarial examples.
Adversarial Example Detection and Classification With Asymmetrical Adversarial Training
TLDR
This paper presents an adversarial example detection method that provides performance guarantee to norm constrained adversaries, and uses the learned class conditional generative models to define generative detection/classification models that are both robust and more interpretable.
Understanding the Decision Boundary of Deep Neural Networks: An Empirical Study
TLDR
The minimum distance of data points to the decision boundary, and how this margin evolves over the training of a deep neural network, is studied, observing that the decision boundary moves closer to natural images over training.
Benchmarking Adversarial Robustness on Image Classification
  • Yinpeng Dong, Qi-An Fu, Jun Zhu
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
TLDR
A comprehensive, rigorous, and coherent benchmark to evaluate adversarial robustness on image classification tasks is established and several important findings are drawn that can provide insights for future research.
Bamboo: Ball-Shape Data Augmentation Against Adversarial Attacks from All Directions
TLDR
Bamboo is proposed, the first data augmentation method designed to improve the general robustness of DNNs without any hypothesis on the attacking algorithm, achieving better results compared to previous adversarial training, robust optimization, and other data augmentation methods with the same amount of data points.
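As a rough, assumption-laden sketch of ball-shaped augmentation (not the authors' code), one could add copies of each training point perturbed by vectors sampled uniformly on a sphere of fixed radius:

```python
import numpy as np

def ball_augment(X, y, radius=0.1, copies_per_point=4, rng=None):
    """Augment (X, y) with points placed on a sphere of the given radius
    around each original point. X has shape (n, d); radius is a hypothetical knob."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Sample directions uniformly on the unit sphere, then scale by the radius.
    dirs = rng.normal(size=(n * copies_per_point, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    X_new = np.repeat(X, copies_per_point, axis=0) + radius * dirs
    y_new = np.repeat(y, copies_per_point, axis=0)
    return np.concatenate([X, X_new]), np.concatenate([y, y_new])
```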
Hold me tight! Influence of discriminative features on deep network boundaries
TLDR
This work rigorously confirms that neural networks exhibit a high invariance to non-discriminative features, and shows that the decision boundaries of a DNN can only exist as long as the classifier is trained with some features that hold them together.
Early Layers Are More Important For Adversarial Robustness
TLDR
A novel method to measure and attribute adversarial effectiveness to each layer, based on partial adversarial training, finds that, while all layers in an adversarially trained network contribute to robustness, earlier layers play a more crucial role.
Improving Calibration through the Relationship with Adversarial Robustness
TLDR
Adversarial Robustness based Adaptive Label Smoothing (AR-AdaLS) is proposed, which integrates the correlation between adversarial robustness and calibration into training by adaptively softening labels for an example based on how easily it can be attacked by an adversary.
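Only as a hedged guess at the mechanism (not the paper's actual formulation): labels could be softened more for examples with a low robustness score, e.g.:

```python
import torch
import torch.nn.functional as F

def adaptive_label_smoothing(labels, robustness, num_classes, max_smooth=0.2):
    """Soften one-hot labels more for examples that are easier to attack.

    `robustness` is an assumed per-example score in [0, 1]:
    1 = hard to attack, 0 = easy to attack.
    """
    smooth = max_smooth * (1 - robustness)                 # per-example smoothing amount
    one_hot = F.one_hot(labels, num_classes).float()
    uniform = torch.full_like(one_hot, 1.0 / num_classes)
    return (1 - smooth).unsqueeze(1) * one_hot + smooth.unsqueeze(1) * uniform
```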
...

References

SHOWING 1-10 OF 16 REFERENCES
The Space of Transferable Adversarial Examples
TLDR
It is found that adversarial examples span a contiguous subspace of large (~25) dimensionality, which indicates that it may be possible to design defenses against transfer-based attacks, even for models that are vulnerable to direct attacks.
Detecting Adversarial Samples from Artifacts
TLDR
This paper investigates model confidence on adversarial samples by looking at Bayesian uncertainty estimates, available in dropout neural networks, and by performing density estimation in the subspace of deep features learned by the model, and results show a method for implicit adversarial detection that is oblivious to the attack algorithm.
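One half of that detector, Monte Carlo dropout uncertainty, can be sketched as follows (a simplification that assumes the model uses dropout layers and has no batch-norm complications; `model` and `x` are placeholders):

```python
import torch

def mc_dropout_uncertainty(model, x, num_samples=20):
    """Estimate predictive uncertainty by keeping dropout active at test time
    (Monte Carlo dropout) and measuring the variance of the softmax outputs."""
    model.train()  # keeps dropout stochastic; note this also affects batch norm
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x.unsqueeze(0)), dim=1)
                             for _ in range(num_samples)])
    return probs.var(dim=0).mean().item()  # higher values suggest less confidence
```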
Explaining and Harnessing Adversarial Examples
TLDR
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
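The attack introduced in that paper is the fast gradient sign method (FGSM); a minimal PyTorch version, with `model`, batched inputs `x`, and labels `y` as assumed placeholders and the usual [0, 1] pixel range, looks roughly like this:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon=8 / 255):
    """Fast gradient sign method: one signed-gradient step of size epsilon."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()   # move in the direction that increases the loss
    return x_adv.clamp(0, 1).detach()     # keep pixels in the valid range
```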
Delving into Transferable Adversarial Examples and Black-box Attacks
TLDR
This work is the first to conduct an extensive study of transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.
Mitigating Evasion Attacks to Deep Neural Networks via Region-based Classification
TLDR
This work develops new DNNs that are robust to state-of-the-art evasion attacks, proposing region-based classification to achieve robustness to adversarial examples.
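As a hedged illustration of the region-based idea (an approximation, not the authors' implementation), the sketch below classifies by majority vote over points sampled uniformly from a small hypercube around the input; `model`, `x`, and the radius are assumed placeholders:

```python
import torch

def region_based_predict(model, x, radius=0.02, num_samples=100):
    """Predict by majority vote over points sampled uniformly from a hypercube
    of half-width `radius` centered at x, instead of the single point x."""
    model.eval()
    with torch.no_grad():
        noise = (torch.rand(num_samples, *x.shape) * 2 - 1) * radius
        batch = (x.unsqueeze(0) + noise).clamp(0, 1)  # assumes inputs live in [0, 1]
        preds = model(batch).argmax(dim=1)
    return preds.mode().values.item()
```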
Towards Deep Learning Models Resistant to Adversarial Attacks
TLDR
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
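The robust-optimization view in that work is usually paired with a projected gradient descent (PGD) attack under an L-infinity constraint; a standard sketch, assuming batched inputs in [0, 1] and placeholder `model`, `x`, `y`, is:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8 / 255, alpha=2 / 255, steps=10):
    """Projected gradient descent attack within an L-infinity ball of radius epsilon."""
    x_adv = x.clone().detach()
    x_adv = (x_adv + torch.empty_like(x_adv).uniform_(-epsilon, epsilon)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # project back into the ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```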
Towards Evaluating the Robustness of Neural Networks
TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
Exploring the space of adversarial images
  • Pedro Tabacof, E. Valle
  • Computer Science
    2016 International Joint Conference on Neural Networks (IJCNN)
  • 2016
TLDR
This work formalizes the problem of adversarial images given a pretrained classifier, showing that even in the linear case the resulting optimization problem is nonconvex and that a shallow classifier seems more robust to adversarial images than a deep convolutional network.
Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods
TLDR
It is concluded that adversarial examples are significantly harder to detect than previously appreciated, and that the properties believed to be intrinsic to adversarial examples are in fact not.
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
TLDR
Clear empirical evidence is given that training with residual connections significantly accelerates the training of Inception networks, and several new streamlined architectures for both residual and non-residual Inception networks are presented.
...