Corpus ID: 125627987

Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

@article{papernot2018cleverhans,
  title={Technical Report on the CleverHans v2.1.0 Adversarial Examples Library},
  author={Nicolas Papernot and Fartash Faghri and Nicholas Carlini and Ian J. Goodfellow and Reuben Feinman and Alexey Kurakin and Cihang Xie and Yash Sharma and Tom B. Brown and Aurko Roy and Alexander Matyasko and Vahid Behzadan and Karen Hambardzumyan and Zhishuai Zhang and Yi-Lin Juang and Zhi Li and Ryan Sheatsley and Abhibhav Garg and Jonathan Uesato and Willi Gierke and Yinpeng Dong and David Berthelot and Paul N. J. Hendricks and Jonas Rauber and Rujun Long and Patrick McDaniel},
  journal={arXiv: Learning}
}
CleverHans is a software library that provides standardized reference implementations of adversarial example construction techniques and adversarial training. The library may be used to develop more robust machine learning models and to provide standardized benchmarks of models' performance in the adversarial setting. Benchmarks constructed without a standardized implementation of adversarial example construction are not comparable to each other, because a good result may indicate a robust… 
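As a concrete illustration of what such a standardized reference implementation pins down, here is a minimal sketch of the fast gradient sign method (FGSM) on a toy logistic model. The model, weights, and function names are illustrative assumptions for this sketch, not the CleverHans API:

```python
import math

# Toy differentiable model: logistic regression on 2-D inputs.
# w and b are assumed example parameters, not library values.
w = [2.0, -1.0]
b = 0.0

def predict(x):
    """Probability that x belongs to class 1."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

def loss_grad_x(x, y):
    """Gradient of the cross-entropy loss w.r.t. the input x."""
    p = predict(x)
    return [(p - y) * w[0], (p - y) * w[1]]

def fgsm(x, y, eps):
    """Fast Gradient Sign Method: one step of size eps along sign(grad)."""
    sign = lambda v: (v > 0) - (v < 0)
    g = loss_grad_x(x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

x = [1.0, 1.0]           # clean input with true label 1
x_adv = fgsm(x, 1, 0.5)  # perturbation of L-infinity size 0.5
```

Because every step here (loss, gradient, step size, sign) is fixed by the reference implementation, two papers running this attack produce comparable numbers, which is the comparability the abstract argues for.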

RobustBench: a standardized adversarial robustness benchmark

This work evaluates the robustness of the models in its benchmark with AutoAttack, an ensemble of white-box and black-box attacks recently shown in a large-scale study to improve almost all robustness evaluations compared to the original publications.

MULDEF: Multi-model-based Defense Against Adversarial Examples for Neural Networks

The evaluation results show that MulDef (with only up to 5 models in the family) can substantially improve the target model's accuracy on adversarial examples by 22-74% in a white-box attack scenario, while maintaining similar accuracy on legitimate examples.

Security Matters: A Survey on Adversarial Machine Learning

This paper serves to give a comprehensive introduction to a range of aspects of the adversarial deep learning topic, including its foundations, typical attacking and defending strategies, and some extended studies.

Cryptographic approaches to security and optimization in machine learning

This work creates new security definitions and classifier constructions which allow for an upper bound on the adversarial error that decreases as standard test error decreases, and investigates non-statistical biases and algorithms for nonconvex optimization problems.

The Space of Transferable Adversarial Examples

It is found that adversarial examples span a contiguous subspace of large (~25) dimensionality, which indicates that it may be possible to design defenses against transfer-based attacks, even for models that are vulnerable to direct attacks.

New CleverHans Feature: Better Adversarial Robustness Evaluations with Attack Bundling

This technical report describes a new feature of the CleverHans library called "attack bundling", which can be used with different prioritization schemes to optimize quantities such as error rate on adversarial examples, perturbation size needed to cause misclassification, or failure rate when using a specific confidence threshold.
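The bundling idea can be sketched as follows: run several attacks per example and count the example as misclassified if any attack succeeds, so the reported error rate reflects the per-example worst case. The model and attack callables below are hypothetical stand-ins, not CleverHans interfaces:

```python
# Sketch of attack bundling for the error-rate objective.
# `attacks` are hypothetical attack functions mapping (x, y) to an
# adversarial input; nothing here is the actual CleverHans API.

def bundle_error_rate(model, attacks, examples):
    """Error rate when each example is attacked by every attack and
    counted as misclassified if ANY attack fools the model."""
    errors = 0
    for x, y in examples:
        if any(model(att(x, y)) != y for att in attacks):
            errors += 1
    return errors / len(examples)

# Toy demo: a threshold "model" on scalars and two perturbation "attacks".
model = lambda x: int(x > 0)
attack_a = lambda x, y: x - 0.3  # push toward class 0
attack_b = lambda x, y: x + 0.3  # push toward class 1
data = [(0.2, 1), (0.9, 1), (-0.5, 0)]

rate = bundle_error_rate(model, [attack_a, attack_b], data)
```

Other prioritization schemes mentioned in the report (smallest successful perturbation, confidence-thresholded failure rate) would swap in a different per-example reduction than `any(...)`.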

On Improving the Effectiveness of Adversarial Training

  • Yi Qin, Ryan Hunt, Chuan Yue
  • Computer Science
    Proceedings of the ACM International Workshop on Security and Privacy Analytics - IWSPA '19
  • 2019
An adversarial training experimental framework is designed to answer two research questions; it finds that MBEAT is indeed beneficial, indicating that it has important value in practice, and that RGOAT indeed exists, indicating that adversarial training should be an iterative process.

Adversarial Examples: Attacks and Defenses for Deep Learning

The methods for generating adversarial examples for DNNs are summarized, a taxonomy of these methods is proposed, and three major challenges posed by adversarial examples are discussed along with potential solutions.

Adversarial Logit Pairing

Improved techniques for defending against adversarial examples at scale are developed and it is shown that adversarial logit pairing achieves the state of the art defense on ImageNet against PGD white box attacks, with an accuracy improvement.
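A minimal sketch of the logit-pairing idea, assuming illustrative logit values and an example weighting term rather than the paper's actual training code: the usual adversarial training loss is augmented with a penalty pulling the logits of a clean input and its adversarial counterpart together.

```python
# Sketch of adversarial logit pairing. The logits and the weight `lam`
# are assumed example values, not outputs of a real model.

def logit_pairing_penalty(logits_clean, logits_adv):
    """Squared L2 distance between clean and adversarial logits."""
    return sum((a - c) ** 2 for a, c in zip(logits_clean, logits_adv))

def alp_loss(task_loss_adv, logits_clean, logits_adv, lam=0.5):
    """Adversarial training loss plus the weighted pairing term."""
    return task_loss_adv + lam * logit_pairing_penalty(logits_clean, logits_adv)

loss = alp_loss(1.2, [2.0, -1.0], [1.0, 0.0], lam=0.5)
```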

FenceBox: A Platform for Defeating Adversarial Examples with Data Augmentation Techniques

This paper presents FenceBox, a comprehensive framework to defeat various kinds of adversarial attacks, equipped with 15 data augmentation methods from three different categories; comprehensive evaluations show that these methods can effectively mitigate various adversarial attacks.



Attacking the Madry Defense Model with L1-based Adversarial Examples

The experimental results demonstrate that by relaxing the constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion.

Ensemble Adversarial Training: Attacks and Defenses

This work finds that adversarial training remains vulnerable to black-box attacks, where perturbations computed on undefended models are transferred, as well as to a powerful novel single-step attack that escapes the non-smooth vicinity of the input data via a small random step.
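The random-step-then-gradient-step attack described above can be sketched on a toy logistic model; the model, step sizes, and function names are illustrative assumptions, not the paper's implementation:

```python
import math
import random

# Sketch of an R+FGSM-style single-step attack: a small random step first
# (to escape the non-smooth vicinity of the data point), then a
# gradient-sign step using the remaining perturbation budget.
# The toy logistic model and step sizes are illustrative assumptions.

w, b = [2.0, -1.0], 0.0

def grad_x(x, y):
    """Gradient of the cross-entropy loss w.r.t. the input."""
    p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
    return [(p - y) * w[0], (p - y) * w[1]]

def r_fgsm(x, y, eps=0.5, alpha=0.1, seed=0):
    sign = lambda v: (v > 0) - (v < 0)
    rng = random.Random(seed)
    # 1) random sign step of size alpha
    x1 = [xi + alpha * sign(rng.gauss(0, 1)) for xi in x]
    # 2) gradient-sign step of size eps - alpha
    g = grad_x(x1, y)
    return [xi + (eps - alpha) * sign(gi) for xi, gi in zip(x1, g)]

x_clean = [1.0, 1.0]          # true label 1
x_adv = r_fgsm(x_clean, 1)    # total L-infinity budget eps = 0.5
```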

Adversarial Risk and the Dangers of Evaluating Against Weak Attacks

This paper motivates the use of adversarial risk as an objective, although it cannot easily be computed exactly, and frames commonly used attacks and evaluation metrics as defining a tractable surrogate objective to the true adversarial risk.

Adversarial examples in the physical world

It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera, which shows that even in physical world scenarios, machine learning systems are vulnerable to adversarial examples.

Towards Deep Learning Models Resistant to Adversarial Attacks

This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
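The robust-optimization view motivates projected gradient descent (PGD) as the canonical first-order adversary: iterate gradient-sign steps and project back onto the allowed perturbation set. A minimal sketch on a toy logistic model (model and hyperparameters are illustrative assumptions):

```python
import math

# Sketch of PGD under an L-infinity budget eps around the clean input.
# The toy logistic model is an illustrative assumption, not the paper's network.

w, b = [2.0, -1.0], 0.0

def grad_x(x, y):
    """Gradient of the cross-entropy loss w.r.t. the input."""
    p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
    return [(p - y) * w[0], (p - y) * w[1]]

def pgd(x0, y, eps=0.5, alpha=0.2, steps=5):
    sign = lambda v: (v > 0) - (v < 0)
    x = list(x0)
    for _ in range(steps):
        g = grad_x(x, y)
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, g)]
        # project back onto the L-infinity ball of radius eps around x0
        x = [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]
    return x

x_adv = pgd([1.0, 1.0], 1)
```

With step size `alpha` larger than needed, the iterates hit the boundary of the budget and the projection clips them there, which is why PGD is often described as exploring the surface of the epsilon-ball.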

Towards Evaluating the Robustness of Neural Networks

It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.

Explaining and Harnessing Adversarial Examples

It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
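The linear-nature argument yields the fast gradient sign method in closed form; with model parameters $\theta$, input $x$, label $y$, and loss $J$:

```latex
\eta = \epsilon \,\operatorname{sign}\!\left(\nabla_x J(\theta, x, y)\right),
\qquad x_{\mathrm{adv}} = x + \eta
```

Under the linearity hypothesis, this single step maximizes the first-order increase in the loss subject to an $L_\infty$ bound of $\epsilon$ on the perturbation.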

Boosting Adversarial Attacks with Momentum

A broad class of momentum-based iterative algorithms to boost adversarial attacks by integrating the momentum term into the iterative process for attacks, which can stabilize update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples.
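A minimal sketch of the momentum-based iterative update: accumulate a decayed running gradient (normalized by its L1 norm) and step along its sign. The toy logistic model and hyperparameters are illustrative assumptions, not the paper's settings:

```python
import math

# Sketch of an MI-FGSM-style momentum iterative attack.
# The toy logistic model is an illustrative assumption.

w, b = [2.0, -1.0], 0.0

def grad_x(x, y):
    """Gradient of the cross-entropy loss w.r.t. the input."""
    p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
    return [(p - y) * w[0], (p - y) * w[1]]

def mi_fgsm(x0, y, eps=0.5, steps=5, mu=1.0):
    sign = lambda v: (v > 0) - (v < 0)
    alpha = eps / steps            # per-step size
    x, g = list(x0), [0.0] * len(x0)
    for _ in range(steps):
        grad = grad_x(x, y)
        l1 = sum(abs(v) for v in grad) or 1.0
        # momentum update: decay previous direction, add normalized gradient
        g = [mu * gi + vi / l1 for gi, vi in zip(g, grad)]
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, g)]
    return x

x_adv = mi_fgsm([1.0, 1.0], 1)
```

The accumulated direction `g` smooths over noisy per-step gradients, which is what the abstract credits for stabler update directions and more transferable adversarial examples.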

EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

The authors' elastic-net attacks to DNNs (EAD) feature L1-oriented adversarial examples and include the state-of-the-art L2 attack as a special case, suggesting novel insights on leveraging L1 distortion in adversarial machine learning and security implications of DNNs.
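The elastic-net formulation can be written as an optimization over the adversarial input $x'$, where $f$ is the attack (misclassification) loss and $c$, $\beta$ are trade-off weights; setting $\beta = 0$ drops the L1 term and recovers a pure-L2 attack, matching the "special case" noted above:

```latex
\min_{x'} \; c \cdot f(x')
\;+\; \beta \,\lVert x' - x \rVert_1
\;+\; \lVert x' - x \rVert_2^2
```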

The Limitations of Deep Learning in Adversarial Settings

This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.