Corpus ID: 209444850

# Jacobian Adversarially Regularized Networks for Robustness

@article{Chan2020JacobianAR,
  title={Jacobian Adversarially Regularized Networks for Robustness},
  author={Alvin Chan and Yi Tay and Yew Soon Ong and Jie Fu},
  journal={ArXiv},
  year={2020},
  volume={abs/1912.10185}
}
Adversarial examples are crafted with imperceptible perturbations intended to fool neural networks. Against such attacks, adversarial training and its variants stand as the strongest defense to date. Previous studies have pointed out that robust models that have undergone adversarial training tend to produce more salient and interpretable Jacobian matrices than their non-robust counterparts. A natural question is whether a model trained with an objective to produce salient Jacobians can…
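The Jacobian in question is the matrix of partial derivatives of the model's outputs with respect to its input features. A minimal numpy sketch of computing it, and the saliency map derived from it, for a toy softmax-linear classifier (the model, sizes, and saliency definition here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def input_jacobian(W, b, x):
    """Jacobian of the softmax probabilities w.r.t. the input x.

    For logits z = W @ x + b, dz/dx = W, and with p = softmax(z),
    dp/dx = (diag(p) - p p^T) @ W by the chain rule through softmax.
    """
    p = softmax(W @ x + b)
    return (np.diag(p) - np.outer(p, p)) @ W

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # 3 classes, 5 input features (toy sizes)
b = np.zeros(3)
x = rng.normal(size=5)

J = input_jacobian(W, b, x)
saliency = np.abs(J).sum(axis=0)   # per-feature saliency map
print(J.shape, saliency.shape)     # (3, 5) (5,)
```

Because the softmax probabilities sum to one, the columns of this Jacobian sum to zero, which is a quick sanity check on any implementation.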

#### Citations of this paper

Layer-wise Adversarial Defense: An ODE Per…
• 2020
Deep neural networks are observed to be fragile against adversarial attacks, which has dramatically limited their practical applicability. On improving model robustness, the adversarial training…
Generating transferable adversarial examples based on perceptually-aligned perturbation
• Computer Science
• 2021
A novel adversarial-example (AE) generation method that achieves state-of-the-art performance in fooling robust models when combined with the diverse-inputs method and the momentum iterative fast gradient sign method.
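The momentum iterative fast gradient sign method (MI-FGSM) mentioned above accumulates an L1-normalized gradient across steps before taking the sign. A generic numpy sketch, where the attacked model is supplied by the caller as a gradient function (the toy quadratic target below is an illustrative assumption):

```python
import numpy as np

def mi_fgsm(x, grad_fn, eps=0.3, steps=10, mu=1.0):
    """Momentum iterative FGSM sketch.

    grad_fn(x) must return the gradient of the attacked model's loss
    w.r.t. x; the model itself is assumed to be supplied by the caller.
    """
    alpha = eps / steps          # per-step size so the total budget is eps
    g = np.zeros_like(x)         # accumulated (momentum) gradient
    x_adv = x.copy()
    for _ in range(steps):
        grad = grad_fn(x_adv)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)  # L1-normalized accumulation
        x_adv = x_adv + alpha * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)          # stay inside the L_inf ball
    return x_adv

# toy target: loss(x) = 0.5 * ||x||^2, so the gradient is x itself
x0 = np.array([0.5, -0.2, 0.1])
x_adv = mi_fgsm(x0, grad_fn=lambda x: x, eps=0.3)
```

The momentum term damps oscillating gradient components across iterations, which is what improves transferability to other models.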
Improving Model Robustness with Latent Distribution Locally and Globally
• Zhuang Qian, Kaizhu Huang, Qiufeng Wang, Rui Zhang
• Computer Science
• ArXiv
• 2021
This work proposes a novel adversarial training method through robust optimization, and a tractable way to generate Latent Manifold Adversarial Examples (LMAEs) via an adversarial game between a discriminator and a classifier.
• Computer Science
• ArXiv
• 2020
Inspired by gradient-sampling techniques from non-convex optimization, this work proposes to calculate the gradient direction of the adversarial attack as a weighted average over multiple points in the local vicinity, which increases the attack success rate and thus leads to more accurate robustness estimates.
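The averaging idea can be sketched in a few lines of numpy; uniform weights and Gaussian sampling around the point are simplifying assumptions here (the cited work uses a more general weighted average):

```python
import numpy as np

def sampled_gradient(x, grad_fn, sigma=0.05, n=32, rng=None):
    """Average the loss gradient over random points near x.

    grad_fn(x) returns the gradient of the attacked model's loss at x;
    the model is assumed to be supplied by the caller. Averaging over
    the local vicinity smooths out gradient masking artifacts.
    """
    rng = rng or np.random.default_rng(0)
    grads = [grad_fn(x + sigma * rng.normal(size=x.shape)) for _ in range(n)]
    return np.mean(grads, axis=0)
```

For a smooth loss, the sampled estimate converges to the true gradient as `n` grows; its value lies in cases where the pointwise gradient is noisy or obfuscated.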
Attacking Adversarial Attacks as A Defense
• Boxi Wu, +6 authors Wei Liu
• Computer Science
• ArXiv
• 2021
This work finds that adversarial attacks can themselves be vulnerable to small perturbations, and proposes to counter attacks by crafting more effective defensive perturbations, leveraging the fact that adversarial training endows the ground-truth class with smaller local Lipschitzness.
Noise Modulation: Let Your Model Interpret Itself
• Haoyang Li, Xinggang Wang
• Computer Science
• ArXiv
• 2021
Noise modulation is proposed as an efficient and model-agnostic alternative for training a model that interprets itself with input gradients; results show that noise modulation can effectively increase the interpretability of input gradients in a model-agnostic way.
• Keke Du, Shan Chang, Huixiang Wen, Hao Zhang
• Computer Science
• ACM TUR-C
• 2021
Two methods are used to add gradient-dependent constraint terms to the loss function of neural network models; both improve the robustness of the models and the interpretability of their gradients.
Impact of Spatial Frequency Based Constraints on Adversarial Robustness
• Rémi Bernhard, +4 authors M. Reyboz
• Computer Science
• 2021 International Joint Conference on Neural Networks (IJCNN)
• 2021
The robustness to adversarial perturbations of models enforced during training to leverage information corresponding to different spatial frequency ranges is investigated, and it is shown to be tightly linked to the spatial frequency characteristics of the data at stake.
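A frequency-range constraint of the kind studied above can be illustrated with a numpy low-pass filter over the 2-D Fourier spectrum; the hard circular mask is our simplification of whatever constraint the cited work actually trains under:

```python
import numpy as np

def low_pass(x, cutoff):
    """Keep only spatial frequencies of a 2-D image below `cutoff`.

    The DC component sits at the center after fftshift; frequencies
    farther than `cutoff` from the center are zeroed out.
    """
    F = np.fft.fftshift(np.fft.fft2(x))
    h, w = x.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    F[dist > cutoff] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))
```

A constant image (pure DC) passes through unchanged, while a checkerboard at the Nyquist frequency is removed entirely, which makes the filter easy to sanity-check.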
DSRNA: Differentiable Search of Robust Neural Architectures
• Computer Science
• CVPR
• 2021
Methods to perform differentiable search of robust neural architectures by maximizing robustness metrics are proposed, and the harvested architectures are shown to be more robust to various norm-bounded attacks than several robust NAS baselines.
Fighting fire with fire: A spatial–frequency ensemble relation network with generative adversarial learning for adversarial image classification
• Wenbo Zheng, Fei-Yue Wang
• Computer Science
• Int. J. Intell. Syst.
• 2021
A novel spatial–frequency ensemble relation network with generative adversarial learning that designs a discriminator model to determine whether an adversarial image is from the matching category or not, achieving significantly higher performance compared with other state-of-the-art approaches.

#### References

Showing 1–10 of 33 references
A novel regularizer is introduced that encourages the loss to behave linearly in the vicinity of the training data, penalizing gradient obfuscation while encouraging robustness; extensive experiments on CIFAR-10 and ImageNet show that models trained with this regularizer avoid gradient obfuscation and can be trained significantly faster than with adversarial training.
Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing their Input Gradients
• Computer Science, Mathematics
• AAAI
• 2018
It is demonstrated that regularizing input gradients makes them more naturally interpretable as rationales for model predictions, and that the resulting models exhibit robustness to transferred adversarial examples generated to fool all of the other models.
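Input-gradient regularization adds a penalty on the norm of the loss gradient with respect to the input. For logistic regression this gradient has a closed form, so the idea can be sketched without a second autodiff pass (the model choice and penalty weight are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_input_grad_penalty(w, b, x, y, lam=0.1):
    """Cross-entropy loss plus an input-gradient penalty (a sketch of
    the double-backpropagation idea; lam is an assumed weight).

    For logistic regression with p = sigmoid(w @ x + b),
    d loss / d x = (p - y) * w in closed form.
    """
    p = sigmoid(w @ x + b)
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    input_grad = (p - y) * w
    return ce + lam * np.sum(input_grad ** 2)
```

For deep networks the input gradient has no closed form and the penalty requires differentiating through the backward pass, which is where the "double backpropagation" name comes from.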
Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization
• Computer Science, Mathematics
• ECCV
• 2018
This work suggests a theoretically inspired novel approach to improving network robustness using the Frobenius norm of the Jacobian of the network, applied as post-processing after regular training has finished, and demonstrates empirically that it leads to enhanced robustness with a minimal change in the original network's accuracy.
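Computing the full Jacobian is expensive for large networks, so the Frobenius norm is typically estimated stochastically: for v ~ N(0, I), E[||v^T J||^2] = tr(J J^T) = ||J||_F^2, and v^T J is exactly what one backward pass yields. A numpy sketch of this estimator (the linear toy model used to verify it is an assumption):

```python
import numpy as np

def frob_sq_hutchinson(vjp, out_dim, n=64, rng=None):
    """Estimate ||J||_F^2 = E_v[ ||v^T J||^2 ] for v ~ N(0, I).

    vjp(v) must return v^T J, a vector-Jacobian product; the model
    behind it is assumed to be supplied by the caller.
    """
    rng = rng or np.random.default_rng(0)
    est = 0.0
    for _ in range(n):
        v = rng.normal(size=out_dim)
        est += np.sum(vjp(v) ** 2)
    return est / n

# for a linear model z = W x the Jacobian is W itself, so v^T J = v^T W
W = np.array([[1.0, 2.0], [0.0, -1.0]])
estimate = frob_sq_hutchinson(lambda v: v @ W, out_dim=2, n=20000)
exact = np.sum(W ** 2)   # ||W||_F^2 = 6
```

In practice a single random projection per minibatch is often enough, since the noise averages out over training steps.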
Adversarial Examples Are a Natural Consequence of Test Error in Noise
• Computer Science, Mathematics
• ICML
• 2019
It is suggested that improving adversarial robustness should go hand in hand with improving performance in the presence of more general and realistic image corruptions, and that future adversarial defenses should consider evaluating the robustness of their methods to distributional shift with benchmarks such as ImageNet-C.
• Computer Science, Mathematics
• NeurIPS
• 2018
This paper presents a technique for extending these training procedures to much more general networks, with skip connections and general nonlinearities, and shows how to further improve robust error through cascade models.
Adversarial Training and Robustness for Multiple Perturbations
• Computer Science, Mathematics
• NeurIPS
• 2019
It is proved that a trade-off in robustness to different types of $\ell_p$-bounded and spatial perturbations must exist in a natural and simple statistical setting, raising questions about the viability and computational scalability of extending adversarial robustness, and adversarial training, to multiple perturbation types.
Semidefinite relaxations for certifying robustness to adversarial examples
• Computer Science, Mathematics
• NeurIPS
• 2018
A new semidefinite relaxation for certifying robustness that applies to arbitrary ReLU networks is proposed; it is shown that this relaxation is tighter than previous relaxations and produces meaningful robustness guarantees on three different foreign networks whose training objectives are agnostic to the proposed relaxation.