# A critique of the DeepSec Platform for Security Analysis of Deep Learning Models

```bibtex
@article{Carlini2019ACO,
  title   = {A critique of the DeepSec Platform for Security Analysis of Deep Learning Models},
  author  = {Nicholas Carlini},
  journal = {ArXiv},
  year    = {2019},
  volume  = {abs/1905.07112}
}
```

At IEEE S&P 2019, the paper "DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model" aims to systematically evaluate existing adversarial attack and defense methods. This critique argues that DEEPSEC (1) evaluates each defense obliviously, using attacks crafted against undefended models; (2) evaluates attacks and defenses using incorrect implementations that greatly under-estimate their effectiveness; (3) evaluates the robustness of each defense as an average, not based on the most effective attack against that defense; (4) performs several statistical analyses incorrectly and fails to report variance; and (5), as a result of these errors, draws invalid conclusions and makes sweeping generalizations.
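Point (3) concerns how per-attack results are aggregated. A minimal sketch of the issue, using made-up success rates and illustrative attack names, shows why averaging over attacks overstates robustness compared to taking the worst case:

```python
# Hypothetical per-attack success rates for a single defense
# (fraction of test inputs on which each attack succeeds);
# attack names and numbers are illustrative only.
attack_success = {
    "FGSM": 0.10,
    "PGD": 0.85,
    "CW": 0.90,
}

# Accuracy under each attack is 1 minus its success rate.
accuracies = [1.0 - s for s in attack_success.values()]

# Averaging over attacks makes the defense look moderately robust...
average_robustness = sum(accuracies) / len(accuracies)   # ≈ 0.38

# ...but a real adversary simply runs the most effective attack,
# so worst-case accuracy is what matters.
worst_case_robustness = min(accuracies)                  # ≈ 0.10

print(average_robustness, worst_case_robustness)
```

A defense that stops only weak attacks can thus score well under an average-based metric while offering almost no security in practice.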


#### 5 Citations

Benchmarking Adversarial Robustness

- Computer Science, Mathematics
- ArXiv
- 2019

A comprehensive, rigorous, and coherent benchmark to evaluate adversarial robustness on image classification tasks is established, and several important findings are drawn that provide insights for future research.

Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples

- Computer Science
- ArXiv
- 2021

This work defines a set of quantitative indicators which unveil common failures in the optimization of gradient-based attacks, and proposes specific mitigation strategies within a systematic evaluation protocol, providing a first concrete step towards automatizing and systematizing current adversarial robustness evaluations.

RobustBench: a standardized adversarial robustness benchmark

- Computer Science, Mathematics
- ArXiv
- 2020

This work evaluates robustness of models for their benchmark with AutoAttack, an ensemble of white- and black-box attacks which was recently shown in a large-scale study to improve almost all robustness evaluations compared to the original publications.

Taking Care of The Discretization Problem: A Comprehensive Study of the Discretization Problem and A Black-Box Adversarial Attack in Discrete Integer Domain

- Computer Science, Mathematics
- 2019

This work proposes a black-box method that reduces the adversarial-example search problem to a derivative-free optimization problem, achieving a significantly higher success rate in the discrete integer domain than recent black-box methods.

Confusing and Detecting ML Adversarial Attacks with Injected Attractors through Watermarking

- Computer Science
- 2020

A generic method is given that injects attractors from a watermark decoder into the victim model, allowing it to leverage known watermarking schemes for scalability and robustness while providing explainability of the outcomes.

#### References

Showing 1–10 of 11 references

DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model

- Computer Science
- 2019 IEEE Symposium on Security and Privacy (SP)
- 2019

This paper presents the design, implementation, and evaluation of DEEPSEC, a uniform platform that aims to bridge the gap in comprehensive evaluation of adversarial attacks and defenses, and demonstrates its capabilities and advantages as a benchmark platform that can benefit future adversarial learning research.

On Evaluating Adversarial Robustness

- Mathematics, Computer Science
- ArXiv
- 2019

The methodological foundations are discussed, commonly accepted best practices are reviewed, and new methods for evaluating defenses to adversarial examples are suggested.

Towards Deep Learning Models Resistant to Adversarial Attacks

- Computer Science, Mathematics
- ICLR
- 2018

This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
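The robust-optimization view summarized above is usually written as a saddle-point problem. A sketch in standard notation, where $\mathcal{S}$ is the set of allowed perturbations (e.g. an $\ell_\infty$ ball of radius $\varepsilon$) and $L$ is the training loss:

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim \mathcal{D}}
\left[ \max_{\delta \in \mathcal{S}} L(\theta,\, x + \delta,\, y) \right]
```

The inner maximization is approximated in practice by an iterative first-order attack such as PGD, which is what motivates framing the guarantee as security against a first-order adversary.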

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

- Computer Science, Mathematics
- AISec@CCS
- 2017

It is concluded that adversarial examples are significantly harder to detect than previously appreciated, and that the properties believed to be intrinsic to adversarial examples are in fact not intrinsic.

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

- Computer Science, Mathematics
- ICML
- 2018

This work identifies obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples, and develops attack techniques to overcome this effect.

The Limitations of Deep Learning in Adversarial Settings

- Computer Science, Mathematics
- 2016 IEEE European Symposium on Security and Privacy (EuroS&P)
- 2016

This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.

Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples

- Computer Science, Mathematics
- NeurIPS
- 2018

This work proposes a novel adversarial sample detection technique for face recognition models, based on interpretability, that features a novel bi-directional correspondence inference between attributes and internal neurons to identify neurons critical for individual attributes.

Explaining and Harnessing Adversarial Examples

- Computer Science, Mathematics
- ICLR
- 2015

It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature; this view is supported by new quantitative results and gives the first explanation of the most intriguing fact about adversarial examples: their generalization across architectures and training sets.
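The linearity argument summarized above underlies the fast gradient sign method (FGSM) introduced in that paper. In standard notation, with loss $J$, model parameters $\theta$, input $x$, label $y$, and perturbation budget $\varepsilon$:

```latex
x_{\mathrm{adv}} = x + \varepsilon \cdot
\operatorname{sign}\!\bigl(\nabla_x J(\theta, x, y)\bigr)
```

For a model that behaves nearly linearly, the induced change in the output grows with the input dimension, so many tiny per-pixel perturbations can add up to a large change in the loss.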

Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

- Mathematics
- 2016

CleverHans is a software library that provides standardized reference implementations of adversarial example construction techniques and adversarial training. The library may be used to develop more…

Adversarial examples in the physical world

- Computer Science, Mathematics
- ICLR
- 2017

It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through a camera, which shows that machine learning systems are vulnerable to adversarial examples even in physical-world scenarios.