• Corpus ID: 239998230

Improving Local Effectiveness for Global Robust Training

  • Jingyue Lu, M. Pawan Kumar
  • Published 26 October 2021
  • Computer Science
  • arXiv
Many successful robust training methods rely on strong adversaries, which can be prohibitively expensive to generate when the input dimension is high and the model structure is complicated. We adopt a new perspective on robustness and propose a novel training algorithm that allows a more effective use of adversaries. Our method improves the model robustness at each local ball centered around an adversary and then, by combining these local balls through a global term, achieves overall robustness… 


Adversarial Robustness through Local Linearization
A novel regularizer is introduced that encourages the loss to behave linearly in the vicinity of the training data, thereby penalizing gradient obfuscation while encouraging robustness; extensive experiments on CIFAR-10 and ImageNet show that models trained with this regularizer avoid gradient obfuscation and can be trained significantly faster than with adversarial training.
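The regularizer described above penalizes the first-order Taylor error of the loss inside the perturbation ball. A minimal numpy sketch of that quantity, using a toy quadratic loss (the matrix, points, and function names are illustrative, not the paper's implementation):

```python
import numpy as np

# Toy quadratic loss L(x) = 0.5 * x.T A x with analytic gradient A @ x.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
loss = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

def linearity_violation(x, delta):
    """First-order Taylor error |L(x+d) - L(x) - d . grad L(x)|.
    It is zero exactly when the loss is linear along d, which is the
    behaviour such a regularizer drives toward in the perturbation ball."""
    return abs(loss(x + delta) - loss(x) - delta @ grad(x))

x = np.array([1.0, -1.0])
d = np.array([0.1, 0.2])
v = linearity_violation(x, d)
```

For a quadratic loss the violation equals 0.5 * d.T A d, so it grows with the curvature of the loss along d.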
Fast is better than free: Revisiting adversarial training
The surprising discovery is made that it is possible to train empirically robust models using a much weaker and cheaper adversary, an approach previously believed to be ineffective, rendering the method no more costly than standard training in practice.
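The weak, cheap adversary in question is a single-step FGSM perturbation (which the paper pairs with random initialization). A minimal numpy sketch of the FGSM step on a toy logistic loss; the weights and inputs are illustrative only:

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """Single-step FGSM: move each coordinate of x by eps in the
    direction that increases the loss (sign of the input gradient)."""
    return x + eps * np.sign(grad)

# Toy logistic loss L(x) = log(1 + exp(-y * w.x)) with analytic gradient.
w = np.array([0.5, -1.0, 2.0])
y = 1.0
loss = lambda x: np.log1p(np.exp(-y * w.dot(x)))

def input_grad(x):
    return -y * w / (1.0 + np.exp(y * w.dot(x)))

x = np.array([1.0, 0.0, -0.5])
x_adv = fgsm_perturb(x, input_grad(x), eps=0.1)
```

Because only one gradient evaluation is needed per example, the cost is close to that of standard training, unlike multi-step PGD adversaries.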
Robustness via Curvature Regularization, and Vice Versa
It is shown in particular that adversarial training leads to a significant decrease in the curvature of the loss surface with respect to inputs, leading to a drastically more "linear" behaviour of the network.
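A curvature penalty of the kind discussed above can be approximated by comparing input gradients at nearby points. A toy numpy sketch on a quadratic loss, where the curvature is known in closed form (all names and values are illustrative, not from the paper):

```python
import numpy as np

# Toy quadratic loss L(x) = 0.5 * x.T A x, whose input gradient is A @ x.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
grad = lambda x: A @ x

def curvature_penalty(x, h=1e-2, seed=0):
    """Finite-difference curvature proxy: compare the input gradient at x
    with the gradient a small step h away along a random unit direction d."""
    d = np.random.default_rng(seed).normal(size=x.shape)
    d /= np.linalg.norm(d)
    return np.linalg.norm(grad(x + h * d) - grad(x)) / h

x = np.array([1.0, -0.5])
pen = curvature_penalty(x)
```

For a quadratic loss the penalty equals ||A d||, which is bounded by the largest singular value of A; driving such a penalty down during training flattens the loss surface with respect to the inputs.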
Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization
This work suggests a theoretically inspired novel approach to improving network robustness using the Frobenius norm of the Jacobian of the network. The regularization is applied as post-processing, after regular training has finished, and is demonstrated empirically to yield enhanced robustness with minimal change to the original network's accuracy.
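The penalty here is the Frobenius norm of the network's input Jacobian. For a linear model the Jacobian is simply the weight matrix, which makes the idea easy to sketch; the following toy numpy example is illustrative, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))     # weights of a linear model f(x) = W @ x
x = rng.normal(size=5)
target = np.zeros(3)

def loss_with_jacobian_penalty(W, x, target, lam=0.01):
    pred = W @ x
    task_loss = 0.5 * np.sum((pred - target) ** 2)
    # For a linear model the input Jacobian df/dx is exactly W, so the
    # squared Frobenius-norm penalty reduces to sum(W ** 2).
    jac_penalty = np.sum(W ** 2)
    return task_loss + lam * jac_penalty

total = loss_with_jacobian_penalty(W, x, target)
```

For a deep network the Jacobian is input-dependent and is typically estimated with automatic differentiation, but the penalty term plays the same role: small input Jacobians mean small output changes under small input perturbations.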
You Only Propagate Once: Painless Adversarial Training Using Maximal Principle
This work fully exploits the structure of deep neural networks and proposes a novel strategy that decouples the adversary update from gradient back-propagation, which avoids forward- and backward-propagating the data too many times in one iteration and restricts the computation of core descent directions to the first layer of the network, thus speeding up every iteration significantly.
Improving Adversarial Robustness Requires Revisiting Misclassified Examples
This paper proposes a new defense algorithm called MART, which explicitly differentiates misclassified and correctly classified examples during training, and shows that MART and its variant significantly improve state-of-the-art adversarial robustness.
Adversarial Training for Free!
This work presents an algorithm that eliminates the overhead cost of generating adversarial examples by recycling the gradient information computed when updating model parameters, and achieves robustness comparable to PGD adversarial training on the CIFAR-10 and CIFAR-100 datasets at negligible additional cost compared to natural training.
Analysis of universal adversarial perturbations
It is shown that the robustness of deep networks to universal perturbations is driven by a key property of their curvature: there exist shared directions along which the decision boundary of deep networks is systematically positively curved.
Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods
It is concluded that adversarial examples are significantly harder to detect than previously appreciated, and that the properties believed to be intrinsic to adversarial examples are in fact not.
Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing their Input Gradients
It is demonstrated that regularizing input gradients makes them more naturally interpretable as rationales for model predictions, and that gradient-regularized models exhibit robustness to transferred adversarial examples generated to fool all of the other models.
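Input-gradient regularization of this kind adds the squared norm of the loss's input gradient to the training objective (often computed via "double backprop"). A toy numpy sketch using a logistic loss whose input gradient is analytic; the weights, inputs, and weighting lam are illustrative:

```python
import numpy as np

# Toy logistic loss L(x) = log(1 + exp(-y * w.x)).
w = np.array([0.5, -1.0, 2.0])
y = 1.0

def loss(x):
    return np.log1p(np.exp(-y * w.dot(x)))

def input_grad(x):
    # Analytic dL/dx for the logistic loss above.
    return -y * w / (1.0 + np.exp(y * w.dot(x)))

def regularized_loss(x, lam=0.1):
    g = input_grad(x)
    # Penalize the squared norm of the input gradient so that small
    # input changes cannot move the loss much.
    return loss(x) + lam * g.dot(g)

x = np.array([1.0, 0.0, -0.5])
```

In a deep network the input gradient is obtained with automatic differentiation, and minimizing the penalty both smooths predictions and tends to make the saliency-style gradients more interpretable.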