Corpus ID: 173991164

Functional Adversarial Attacks

@inproceedings{Laidlaw2019FunctionalAA,
  title={Functional Adversarial Attacks},
  author={Cassidy Laidlaw and Soheil Feizi},
  booktitle={NeurIPS},
  year={2019}
}
We propose functional adversarial attacks, a novel class of threat models for crafting adversarial examples to fool machine learning models. [...] Key Method: We show that functional threat models can be combined with existing additive ($\ell_p$) threat models to generate stronger threat models that allow both small, individual perturbations and large, uniform changes to an input. Moreover, we prove that such combinations encompass perturbations that would not be allowed in either constituent threat model. In…
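To make the combined threat model concrete, the sketch below applies one affine map uniformly to every pixel (a simple stand-in for the functional perturbations the paper studies, e.g. its ReColorAdv color transformation) and then adds a small bounded per-pixel change. The function name, parameter values, and random additive noise are illustrative assumptions, not the paper's attack.

import numpy as np

def functional_plus_additive(x, a=0.9, b=0.05, eps=2/255, rng=None):
    """Illustrative combined perturbation (not the paper's ReColorAdv attack):
    one affine map f(v) = a*v + b is applied uniformly to every pixel value
    (the functional part), then a small bounded per-pixel change is added
    (the additive l_inf part)."""
    rng = np.random.default_rng(0) if rng is None else rng
    x_func = np.clip(a * x + b, 0.0, 1.0)            # large but uniform change
    delta = rng.uniform(-eps, eps, size=x.shape)     # small, individual changes
    return np.clip(x_func + delta, 0.0, 1.0)

x = np.random.default_rng(1).random((32, 32, 3)).astype(np.float32)  # dummy image in [0, 1]
x_adv = functional_plus_additive(x)
print(np.abs(x_adv - x).max())  # can far exceed eps: the uniform shift is not bounded per pixel

Because the uniform shift is not bounded per pixel while each deviation from the color-mapped image stays small, the combined perturbation set is strictly larger than either constituent threat model, which is the point the abstract makes.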
Localized Uncertainty Attacks
The susceptibility of deep learning models to adversarial perturbations has stirred renewed attention in adversarial examples, resulting in a number of attacks. However, most of these attacks fail to…
Learning to Generate Noise for Robustness against Multiple Perturbations
TLDR: This work proposes a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks, and proposes a Meta Noise Generator (MNG) that outputs optimal noise to stochastically perturb a given sample, which helps lower the error on diverse adversarial perturbations.
Learning Defense Transformers for Counterattacking Adversarial Examples
TLDR: This paper theoretically and empirically verifies the existence of defense affine transformations that restore adversarial examples, and learns a defense transformer to counterattack the adversarial examples by parameterizing the affine transformations and exploiting the boundary information of DNNs.
Wasserstein Smoothing: Certified Robustness against Wasserstein Adversarial Attacks
TLDR: This work proposes the first defense with certified robustness against Wasserstein adversarial attacks using randomized smoothing, and develops this certificate by considering the space of possible flows between images and representing this space such that the Wasserstein distance between images is upper-bounded by the $\ell_1$ distance in this flow space.
Generalizing Adversarial Training to Composite Semantic Perturbations
  • Yun-Yun Tsai, Lei Hsiung, Pin-Yu Chen, Tsung-Yi Ho
  • 2021
Model robustness against adversarial examples has been widely studied, yet the lack of generalization to more realistic scenarios can be challenging. Specifically, recent works using adversarial…
GreedyFool: Distortion-Aware Sparse Adversarial Attack
TLDR: This paper proposes a novel two-stage distortion-aware greedy method dubbed GreedyFool, which first selects the most effective candidate positions to modify by considering both the gradient and the distortion map for invisibility, then drops some less important points in the reduce stage.
Lagrangian Objective Function Leads to Improved Unforeseen Attack Generalization in Adversarial Training
TLDR: This paper proposes a simple modification to adversarial training (AT) that mitigates the perturbation $\ell_p$ norm while maximizing the classification loss in the Lagrangian form, and argues that crafting adversarial examples based on this scheme results in enhanced attack generalization in the learned model.
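In symbols, the objective described above can be sketched as $\max_{\delta}\,\mathcal{L}\big(f_\theta(x+\delta),\, y\big) - \lambda\,\|\delta\|_p$, where $f_\theta$ is the classifier, $\mathcal{L}$ the classification loss, and $\lambda$ a trade-off coefficient; this is only a reading of the summary above, not the paper's exact formulation or schedule for $\lambda$.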
Semantics Preserving Adversarial Examples
While progress has been made in crafting adversarial examples with visually imperceivable changes, constructing semantically meaningful ones remains a challenge. In this paper, we propose a framework…
Stronger and Faster Wasserstein Adversarial Attacks
TLDR: An exact yet efficient projection operator is developed to enable a stronger projected gradient attack, and the Frank-Wolfe method equipped with a suitable linear minimization oracle works extremely fast under Wasserstein constraints, improving the robustness of adversarially trained models.
Sample Efficient Detection and Classification of Adversarial Attacks via Self-Supervised Embeddings
TLDR: A self-supervised method to detect adversarial attacks and classify them to their respective threat models, based on a linear model operating on the embeddings from a pre-trained self-supervised encoder, called SimCat.

References

SHOWING 1-10 OF 36 REFERENCES
Constructing Unrestricted Adversarial Examples with Generative Models
TLDR: The empirical results on the MNIST, SVHN, and CelebA datasets show that unrestricted adversarial examples can bypass strong adversarial training and certified defense methods designed for traditional adversarial attacks.
Quantifying Perceptual Distortion of Adversarial Examples
TLDR: This work presents and employs a unifying framework fusing different attack styles to demonstrate the value of quantifying the perceptual distortion of adversarial examples, and performs adversarial training using attacks generated by the framework to demonstrate that networks are only robust to the classes of adversarial perturbations they have been trained against.
Semantic Adversarial Examples
TLDR: This paper introduces a new class of adversarial examples, namely "Semantic Adversarial Examples," as images that are arbitrarily perturbed to fool the model, but in such a way that the modified image semantically represents the same object as the original image.
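As a concrete illustration of this kind of attack (not the paper's exact procedure), the sketch below sweeps a uniform hue rotation over an image and stops at the first rotation the classifier mislabels; model, x, and y are assumed to be a trained classifier, a (3, H, W) float tensor in [0, 1], and its true label.

import torch
import torchvision.transforms.functional as TF

def hue_shift_attack(model, x, y, steps=21):
    """Illustrative semantic attack sketch: apply a uniform hue rotation
    (which preserves the depicted object's identity) and return the first
    rotated image the classifier mislabels, if any."""
    model.eval()
    with torch.no_grad():
        for hue in torch.linspace(-0.5, 0.5, steps):
            x_shift = TF.adjust_hue(x, hue.item())   # uniform color change, same semantics
            if model(x_shift.unsqueeze(0)).argmax(dim=1).item() != y:
                return x_shift, hue.item()
    return None, None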
Towards Deep Learning Models Resistant to Adversarial Attacks
TLDR: This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
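For orientation, a minimal $\ell_\infty$ projected gradient descent (PGD) loop in the spirit of this attack; the budget, step size, and random start below are common defaults, not values taken from the paper.

import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Minimal l_inf projected gradient descent attack sketch."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()  # random start
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()            # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()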
Wasserstein Adversarial Examples via Projected Sinkhorn Iterations
TLDR: A new threat model for adversarial attacks based on the Wasserstein distance is proposed, which can successfully attack image classification models, and it is demonstrated that PGD-based adversarial training can improve adversarial accuracy to 76%.
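To ground the threat model, the sketch below computes an entropic (Sinkhorn) approximation of the Wasserstein distance between two small grayscale images treated as distributions over pixel locations; it illustrates only the underlying metric, not the paper's projected Sinkhorn attack, and the regularization strength and iteration count are arbitrary.

import numpy as np

def sinkhorn_distance(x1, x2, reg=0.5, iters=200):
    """Entropic approximation of the Wasserstein distance between two small
    grayscale images viewed as probability masses over pixel locations."""
    h, w = x1.shape
    a, b = (x1 / x1.sum()).ravel(), (x2 / x2.sum()).ravel()
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), -1).reshape(-1, 2)
    C = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)  # ground cost: pixel distance
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]   # approximate transport plan
    return (P * C).sum()

rng = np.random.default_rng(0)
img1 = rng.random((8, 8))
img2 = np.roll(img1, 1, axis=1)       # mass shifted one pixel to the right (with wraparound)
print(sinkhorn_distance(img1, img2))  # approximate transport cost: expected pixels moved per unit mass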
Spatially Transformed Adversarial Examples
TLDR: Perturbations generated through spatial transformation could result in large $\mathcal{L}_p$ distance measures, but extensive experiments show that such spatially transformed adversarial examples are perceptually realistic and more difficult to defend against with existing defense systems.
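A minimal sketch of the warping mechanism these examples rely on, using PyTorch's bilinear grid_sample; the random flow below stands in for the optimized, smoothness-regularized flow the attack would actually search for.

import torch
import torch.nn.functional as F

def apply_flow(x, flow):
    """Warp images x (N, C, H, W) by a per-pixel displacement field
    flow (N, H, W, 2), given in normalized [-1, 1] grid coordinates,
    using bilinear sampling. In the attack the flow is optimized to cause
    misclassification while staying smooth; here it is merely applied."""
    n, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    identity = torch.stack((xs, ys), dim=-1).expand(n, h, w, 2)   # identity sampling grid
    return F.grid_sample(x, identity + flow, mode="bilinear", align_corners=True)

x = torch.rand(1, 3, 32, 32)
flow = 0.02 * torch.randn(1, 32, 32, 2)   # small, random (not optimized) displacement field
x_st = apply_flow(x, flow)
print((x_st - x).abs().max())             # pixel-value change can be large even for tiny motions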
Adversarial Risk and the Dangers of Evaluating Against Weak Attacks
TLDR: This paper motivates the use of adversarial risk as an objective, although it cannot easily be computed exactly, and frames commonly used attacks and evaluation metrics as defining a tractable surrogate objective to the true adversarial risk.
Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks
TLDR: The study shows that defensive distillation can reduce the effectiveness of sample creation from 95% to less than 0.5% on a studied DNN, and analytically investigates the generalizability and robustness properties granted by the use of defensive distillation when training DNNs.
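A minimal sketch of the temperature-softened soft-label objective at the core of defensive distillation; the temperature value and the random logits in the usage line are illustrative, not taken from the paper.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=20.0):
    """Cross-entropy of the student against the teacher's temperature-softened
    class probabilities (a sketch of the soft-label training step in defensive
    distillation; T is the distillation temperature)."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_probs = F.log_softmax(student_logits / T, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

teacher_logits = torch.randn(4, 10)                       # stand-in teacher outputs for a batch of 4
student_logits = torch.randn(4, 10, requires_grad=True)
print(distillation_loss(student_logits, teacher_logits))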
Adversarial Attacks Beyond the Image Space
TLDR: Though image-space adversaries can be interpreted as per-pixel albedo change, it is verified that they cannot be well explained along these physically meaningful dimensions, which often have a non-local effect.
The Limitations of Adversarial Training and the Blind-Spot Attack
TLDR: It is shown that the effectiveness of adversarial training has a strong correlation with the distance between a test point and the manifold of training data embedded by the network, and that blind-spots also exist in provable defenses including (Wong & Kolter, 2018) and (Sinha et al., 2018).