Corpus ID: 232478602

TRS: Transferability Reduced Ensemble via Encouraging Gradient Diversity and Model Smoothness

Zhuolin Yang, Linyi Li, Xiaojun Xu, Shiliang Zuo, Qian Chen, Benjamin I. P. Rubinstein, Ce Zhang, Bo Li
Adversarial transferability is an intriguing property of adversarial examples: a perturbation crafted against one model is also effective against another model, which may come from a different model family or training process. To better protect ML systems against such adversarial attacks, several questions arise: What are the sufficient conditions for adversarial transferability? Is it possible to bound such transferability? Is there a way to reduce the transferability in order…
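As a rough illustration of the gradient-diversity idea in the abstract above (not the paper's actual TRS loss), one way to quantify how aligned two base models' attack directions are is the cosine similarity of their input gradients; the function name and toy gradients below are hypothetical:

```python
import numpy as np

def gradient_cosine_similarity(grad_a, grad_b):
    """Cosine similarity between two models' loss gradients w.r.t. the
    same input. Low similarity means the models offer diverse attack
    directions, which is associated with reduced transferability."""
    a, b = grad_a.ravel(), grad_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Identical gradients are maximally aligned; orthogonal ones are not.
g1 = np.array([1.0, 0.0, 2.0])
g2 = np.array([0.0, 3.0, 0.0])
aligned = gradient_cosine_similarity(g1, g1)    # close to 1.0
orthogonal = gradient_cosine_similarity(g1, g2) # 0.0
```

A diversity-encouraging training term could then penalize large pairwise similarity across ensemble members.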
On the Certified Robustness for Ensemble Models and Beyond
It is proven that diversified gradients and a large confidence margin are necessary and sufficient conditions for certifiably robust ensemble models under the model-smoothness assumption, and it is proved that an ensemble model can always achieve higher certified robustness than a single base model under mild conditions.
A Little Robustness Goes a Long Way: Leveraging Universal Features for Targeted Transfer Attacks
It is shown that training the source classifier to be “slightly robust”—that is, robust to small-magnitude adversarial examples—substantially improves the transferability of targeted attacks, even between architectures as different as convolutional neural networks and transformers.


Delving into Transferable Adversarial Examples and Black-box Attacks
This work is the first to conduct an extensive study of transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.
DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles
DVERGE is proposed, which isolates the adversarial vulnerability in each sub-model by distilling non-robust features and diversifies this adversarial vulnerability to induce diverse outputs against a transfer attack, enabling improved robustness as more sub-models are added to the ensemble.
Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks
This paper provides a unifying optimization framework for evasion and poisoning attacks, and a formal definition of transferability of such attacks, highlighting two main factors contributing to attack transferability: the intrinsic adversarial vulnerability of the target model, and the complexity of the surrogate model used to optimize the attack.
Boosting Adversarial Attacks with Momentum
A broad class of momentum-based iterative algorithms to boost adversarial attacks by integrating the momentum term into the iterative process for attacks, which can stabilize update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples.
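A minimal sketch of the momentum-based iterative idea (MI-FGSM-style), assuming a generic `grad_fn` that returns the loss gradient with respect to the input; the function name, toy loss, and hyperparameter values are illustrative, not the paper's exact setup:

```python
import numpy as np

def mi_fgsm(x, grad_fn, eps=0.1, alpha=0.02, mu=1.0, steps=10):
    """Momentum iterative attack sketch: accumulate the L1-normalized
    gradient into a velocity term g, step along sign(g), and project
    back into the L_inf ball of radius eps around the clean input x."""
    x_adv = x.copy()
    g = np.zeros_like(x)
    for _ in range(steps):
        grad = grad_fn(x_adv)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)  # momentum update
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
    return x_adv

# Toy demo: "loss" 0.5 * ||x||^2, whose gradient w.r.t. x is x itself,
# so the attack pushes each coordinate outward until it hits the eps ball.
x0 = np.array([0.5, -0.5])
x_adv = mi_fgsm(x0, grad_fn=lambda z: z)
```

The momentum term `g` is what stabilizes the update direction across iterations, which the summary above credits for escaping poor local maxima.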
Towards Evaluating the Robustness of Neural Networks
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.
Analysis of classifiers’ robustness to adversarial perturbations
A general upper bound on the robustness of classifiers to adversarial perturbations is established, and the phenomenon of adversarial instability is suggested to be due to the low flexibility of classifiers, compared to the difficulty of the classification task (captured mathematically by the distinguishability measure).
Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
The proposed Defense-GAN, a new framework leveraging the expressive capability of generative models to defend deep neural networks against adversarial perturbations, is empirically shown to be consistently effective against different attack methods and improves on existing defense strategies.
EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
Elastic-net attacks to DNNs (EAD) feature $L_1$-oriented adversarial examples and include the state-of-the-art $L_2$ attack as a special case, suggesting novel insights on leveraging $L_1$ distortion in adversarial machine learning and security implications of DNNs.
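The $L_1$-oriented behavior comes from an elastic-net penalty on the perturbation; the ISTA-style shrinkage step such attacks build on can be sketched as below (a simplified, hypothetical helper, not the paper's full attack loop):

```python
import numpy as np

def soft_threshold(z, x, beta):
    """Element-wise soft thresholding of the perturbation z - x by beta.
    Perturbation components with magnitude below beta are zeroed out,
    which is what makes the resulting adversarial perturbation sparse."""
    diff = z - x
    shrunk = np.sign(diff) * np.maximum(np.abs(diff) - beta, 0.0)
    return x + shrunk

# Small perturbation components are removed entirely, large ones shrink.
z = np.array([1.0, 0.05])   # candidate adversarial point
x = np.zeros(2)             # clean input
sparse_adv = soft_threshold(z, x, beta=0.1)
```

Here the 0.05 component falls below the threshold and is zeroed, while the 1.0 component only shrinks by `beta`, yielding an $L_1$-sparse perturbation.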
Feature Cross-Substitution in Adversarial Classification
This work investigates both the problem of modeling the objectives of adversaries, as well as the algorithmic problem of accounting for rational, objective-driven adversaries, and presents the first method for combining an adversarial classification algorithm with a very general class of models of adversarial classifier evasion.
The Limitations of Deep Learning in Adversarial Settings
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.