@article{Cai2018CurriculumAT,
  title={Curriculum Adversarial Training},
  author={Qi-Zhi Cai and Min Du and Chang Liu and Dawn Xiaodong Song},
  journal={ArXiv},
  year={2018},
  volume={abs/1805.04807}
}
• Published 13 May 2018
• Computer Science
• ArXiv
Recently, deep learning has been applied to many security-sensitive applications, such as facial authentication. The existence of adversarial examples hinders such applications. The state-of-the-art result on defense shows that adversarial training can be applied to train a robust model on MNIST against adversarial examples; but it fails to achieve a high empirical worst-case accuracy on more complex tasks, such as CIFAR-10 and SVHN. In our work, we propose curriculum adversarial training…
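The curriculum idea sketched in the abstract — train against a k-step PGD adversary and raise k only once the model handles the current attack strength — can be illustrated with a toy example. The logistic model, step sizes, and "mastery" threshold below are illustrative assumptions, not the paper's exact setup (the paper uses deep networks on MNIST, CIFAR-10, and SVHN):

```python
import numpy as np

def pgd_attack(x, y, w, b, eps, step_size, k):
    """k-step L_inf PGD against a logistic classifier p(y=1) = sigmoid(w.x + b).

    Toy stand-in for the adversary; returns a perturbed copy of x
    constrained to the eps-ball around the original input.
    """
    x_adv = x.copy()
    for _ in range(k):
        margin = y * (x_adv @ w + b)               # correct-class margin
        grad = -y * w / (1.0 + np.exp(margin))     # d/dx of log(1 + exp(-margin))
        x_adv = x_adv + step_size * np.sign(grad)  # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back to the eps-ball
    return x_adv

def next_strength(k, robust_acc, threshold=0.9):
    """Curriculum schedule (simplified): move on to the stronger (k+1)-step
    attack only once accuracy under the current k-step attack is mastered."""
    return k + 1 if robust_acc >= threshold else k
```

During training, each batch would be perturbed with `pgd_attack` at the current k, and `next_strength` would be consulted at the end of each epoch, so the model first learns against weak adversaries before facing strong ones.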
78 Citations

## Citations

Efficient Adversarial Training With Transferable Adversarial Examples
• Computer Science
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 2020
This paper shows that there is high transferability between models from neighboring epochs in the same training process, i.e., adversarial examples from one epoch continue to be adversarial in subsequent epochs, and proposes a novel method, Adversarial Training with Transferable Adversarial Examples (ATTA), that can enhance the robustness of trained models and greatly improve the training efficiency by accumulating adversarial perturbations through epochs.
Is PGD-Adversarial Training Necessary? Alternative Training via a Soft-Quantization Network with Noisy-Natural Samples Only
• Computer Science
ArXiv
• 2018
Extensive empirical evaluations on standard datasets show that the proposed models are comparable to PGD-adversarially-trained models under PGD and BPDA attacks, and for the first time fine-tunes a robust ImageNet model within only two days.
Confidence-Calibrated Adversarial Training: Towards Robust Models Generalizing Beyond the Attack Used During Training
• Computer Science
ArXiv
• 2019
It is shown that CCAT better preserves the accuracy of normal training while achieving robustness against adversarial examples via confidence thresholding, and that, in strong contrast to adversarial training, the robustness of CCAT generalizes to larger perturbations and other threat models not encountered during training.
Guided Interpolation for Adversarial Training
• Computer Science
ArXiv
• 2021
The guided interpolation framework (GIF) is proposed: in each epoch, the GIF employs the previous epoch's meta information to guide the data's interpolation, which mitigates the model's linear behavior between classes and encourages the model to predict invariantly in the cluster of each class.
Confidence-Calibrated Adversarial Training and Detection: More Robust Models Generalizing Beyond the Attack Used During Training
Confidence-calibrated adversarial training (CCAT) is introduced where the key idea is to enforce that the confidence on adversarial examples decays with their distance to the attacked examples, and the robustness of CCAT generalizes to larger perturbations and other threat models, not encountered during training.
Improving Adversarial Robustness Through Progressive Hardening
• Computer Science
ArXiv
• 2020
Adversarial Training with Early Stopping (ATES) stabilizes network training even for a large perturbation norm and allows the network to operate at a better clean accuracy versus robustness trade-off curve compared to AT.
CE-based white-box adversarial attacks will not work using super-fitting
• Computer Science
ArXiv
• 2022
This paper mathematically proves the effectiveness of super-fitting and enables the model to reach this state quickly by minimizing unrelated category scores (MUCS), and can make the trained model obtain the highest adversarial robustness.
Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training
• Computer Science
• 2020
A theoretical framework for adversarial training with FW optimization (FW-AT) is developed that reveals a geometric connection between the loss landscape and the distortion of ℓ∞ FW attacks (the attack's ℓ2 norm) and analytically shows that high distortion of FW attacks is equivalent to small gradient variation along the attack path.
Calibrated Adversarial Training
• Computer Science
ArXiv
• 2021
Calibrated Adversarial Training is presented, a method that reduces the adverse effects of semantic perturbations in adversarial training and produces pixel-level adaptations to the perturbation based on a novel calibrated robust error.

## References

Showing 1-10 of 44 references
Adversarial Machine Learning at Scale
• Computer Science
ICLR
• 2017
This research applies adversarial training to ImageNet, finds that single-step attacks are the best for mounting black-box attacks, and resolves a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples.
A General Retraining Framework for Scalable Adversarial Classification
• Computer Science
ArXiv
• 2016
It is shown that, under natural conditions, the retraining framework minimizes an upper bound on optimal adversarial risk, and how to extend this result to account for approximations of evasion attacks.
Towards Evaluating the Robustness of Neural Networks
• Computer Science
2017 IEEE Symposium on Security and Privacy (SP)
• 2017
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples
• Computer Science
ArXiv
• 2016
This work introduces the first practical demonstration that the cross-model transfer phenomenon enables attackers to control a remotely hosted DNN with no access to the model, its parameters, or its training data, and introduces the attack strategy of fitting a substitute model to input-output pairs queried from the target, then crafting adversarial examples based on this auxiliary model.
Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks
• Computer Science
2016 IEEE Symposium on Security and Privacy (SP)
• 2016
The study shows that defensive distillation can reduce the effectiveness of adversarial sample creation from 95% to less than 0.5% on a studied DNN, and analytically investigates the generalizability and robustness properties granted by the use of defensive distillation when training DNNs.
Thermometer Encoding: One Hot Way To Resist Adversarial Examples
• Computer Science
ICLR
• 2018
A simple modification to standard neural network architectures, thermometer encoding, is proposed, which significantly increases the robustness of the network to adversarial examples, and the properties of these networks are explored, providing evidence that thermometer encodings help neural networks to find more non-linear decision boundaries.
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope
• Computer Science
ICML
• 2018
A method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations, and it is shown that the dual problem to this linear program can be represented itself as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss.
Attacking the Madry Defense Model with L1-based Adversarial Examples
• Computer Science
ICLR
• 2018
The experimental results demonstrate that by relaxing the constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion.
Delving into Transferable Adversarial Examples and Black-box Attacks
• Computer Science
ICLR
• 2017
This work is the first to conduct an extensive study of the transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.