• Corpus ID: 59599987

Robustness Certificates Against Adversarial Examples for ReLU Networks

@article{Singla2019RobustnessCA,
  title={Robustness Certificates Against Adversarial Examples for ReLU Networks},
  author={Sahil Singla and Soheil Feizi},
  journal={ArXiv},
  year={2019},
  volume={abs/1902.01235}
}
• Published 1 February 2019
• Computer Science
• ArXiv
While neural networks have achieved high performance in different learning tasks, their accuracy drops significantly in the presence of small adversarial perturbations to inputs. Defenses based on regularization and adversarial training are often followed by new attacks to defeat them. In this paper, we propose attack-agnostic robustness certificates for a multi-label classification problem using a deep ReLU network. Although computing the exact distance of a given input sample to the…
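The abstract is truncated above. To make the notion of an attack-agnostic certificate concrete, the sketch below computes a generic Lipschitz-based lower bound on the l2 distance to the decision boundary of a feed-forward ReLU network; this is an illustrative baseline, not the certificate derived in the paper, and the weights/biases arguments are hypothetical.

    import numpy as np

    def certified_l2_radius(weights, biases, x):
        """Loose but valid l2 robustness radius for a feed-forward ReLU network.

        Uses r = margin / (sqrt(2) * prod_i ||W_i||_2): the product of layer
        spectral norms bounds the network's Lipschitz constant (ReLU is
        1-Lipschitz), so no perturbation with ||delta||_2 < r can change the
        predicted class.  Generic baseline, not the paper's certificate.
        """
        a = x
        for W, b in zip(weights[:-1], biases[:-1]):
            a = np.maximum(W @ a + b, 0.0)            # hidden layers
        logits = weights[-1] @ a + biases[-1]         # final affine layer

        order = np.argsort(logits)[::-1]
        margin = logits[order[0]] - logits[order[1]]  # top-1 vs. runner-up gap

        lip = np.prod([np.linalg.norm(W, 2) for W in weights])
        return margin / (np.sqrt(2.0) * lip)

Such product-of-norms bounds are typically very conservative, which is what motivates dedicated, tighter attack-agnostic certificates of the kind this paper proposes.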
17 Citations

The center smoothing procedure can produce models with the guarantee that the change in the output, as measured by the distance metric, remains small for any norm-bounded adversarial perturbation of the input.
• Computer Science
ArXiv
• 2022
This work presents provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution, and shows provable lower bounds on the performance of models trained on so-called “unlearnable” datasets that have been poisoned to interfere with model training.
• Computer Science, Mathematics
NeurIPS
• 2019
This work presents GeoCert, a novel method for computing exact pointwise robustness of deep neural networks for all convex $\ell_p$ norms, and shows that piecewise-linear neural networks partition the input space into a polyhedral complex (a minimal sketch of this local-linear-region property follows this entry).
• Computer Science
AISTATS
• 2020
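The polyhedral-complex view rests on the fact that, once the ReLU on/off pattern observed at an input is fixed, the network is affine on the surrounding region. A minimal numpy sketch of that property (illustrative only, not the GeoCert algorithm, which additionally reasons about the facets of these regions):

    import numpy as np

    def local_affine_map(weights, biases, x):
        """Affine map (A, c) with logits(z) = A @ z + c on the linear region of x.

        Fixing the ReLU activation pattern seen at x turns every hidden layer
        into a diagonal mask composed with an affine map, so the whole network
        collapses to a single affine function on that polyhedral region.
        """
        A, c = np.eye(len(x)), np.zeros(len(x))
        a = x
        for W, b in zip(weights[:-1], biases[:-1]):
            pre = W @ a + b
            mask = (pre > 0).astype(float)   # activation pattern at x
            A = (mask[:, None] * W) @ A      # compose masked affine maps
            c = mask * (W @ c + b)
            a = np.maximum(pre, 0.0)
        return weights[-1] @ A, weights[-1] @ c + biases[-1]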
This work takes a holistic look at adversarial examples for non-parametric classifiers, including nearest neighbors, decision trees, and random forests, and derives an optimally robust classifier analogous to the Bayes-optimal classifier.
• Computer Science, Mathematics
• 2021
The robustness certificates guarantee that the change in the output of the smoothed model, as measured by the distance metric, remains small for any norm-bounded perturbation of the input (the standard Gaussian-smoothing certificate is recalled after this entry for reference).
• Computer Science
2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
• 2022
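For reference, the most widely used certificate for Gaussian-smoothed classifiers is the randomized-smoothing bound of Cohen et al. (2019); it is quoted here only as the standard classification-case special case, not as the result of the citing work above:

    % Gaussian-smoothed classifier and its certified l2 radius (Cohen et al., 2019).
    % \underline{p_A}, \overline{p_B}: bounds on the top-two class probabilities under noise.
    g(x) = \arg\max_{c} \; \Pr_{\varepsilon \sim \mathcal{N}(0,\sigma^2 I)}\big[ f(x+\varepsilon) = c \big],
    \qquad
    R = \frac{\sigma}{2}\Big( \Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B}) \Big),

and $g(x+\delta) = g(x)$ for every $\|\delta\|_2 < R$, where $\Phi^{-1}$ is the standard Gaussian quantile function.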
This work presents a technique that utilizes properties of random projections to characterize the behavior of clean and adversarial examples across a diverse set of subspaces and demonstrates that this technique outperforms competing detection strategies while remaining truly agnostic to the attack strategy.
• Computer Science
ArXiv
• 2020
This work presents a technique that makes use of special properties of random projections to characterize the behavior of clean and adversarial examples across a diverse set of subspaces, and shows that it outperforms competing state-of-the-art detection strategies while remaining truly agnostic to the attack method itself.
• Computer Science, Mathematics
ICML
• 2020
It is shown that extending the smoothing technique to defend against other attack models can be challenging, especially in the high-dimensional regime, and it is established that Gaussian smoothing provides the best possible results, up to a constant factor, when $p \geq 2$.
• Computer Science
NeurIPS
• 2020
It is demonstrated that extra information about the base classifier at the input point can help improve certified guarantees for the smoothed classifier.
• Computer Science
NeurIPS
• 2020
The proposed method is based on Lagrange dualization and convex envelopes, which yield tight approximation bounds that are efficiently computable by dynamic programming and allow an increased number of graphs to be certified as robust.

References

Showing 1-10 of 44 references

• Computer Science
ICLR
• 2018
This work proposes a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value, providing an adaptive regularizer that encourages robustness against all attacks.
• Computer Science
ICML
• 2018
A method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations, and it is shown that the dual problem to this linear program can be represented itself as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss.
• Computer Science
NeurIPS
• 2018
This paper introduces CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points and facilitates the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation.
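The adaptive per-neuron surrogates referred to here are linear upper and lower bounds on each activation. For a ReLU with pre-activation bounds $l < 0 < u$, the standard relaxation (a generic form, not CROWN's full backward-bounding pass) is:

    % Linear relaxation of y = ReLU(\hat{x}) on l \le \hat{x} \le u with l < 0 < u.
    \alpha\,\hat{x} \;\le\; \mathrm{ReLU}(\hat{x}) \;\le\; \frac{u}{u-l}\,\big(\hat{x} - l\big),
    \qquad \alpha \in [0, 1],

where one common adaptive choice is $\alpha = 1$ when $u \geq |l|$ and $\alpha = 0$ otherwise, which is the kind of per-neuron selection that tightens the certified lower bound.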
• Computer Science
ICLR
• 2019
This work formulates verification of piecewise-linear neural networks as a mixed-integer program, which is able to certify more samples than the state of the art and to find more adversarial examples than a strong first-order attack for every network.
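At the core of such mixed-integer formulations is the standard big-M encoding of one ReLU $y = \max(0, \hat{x})$, given pre-activation bounds $l \le \hat{x} \le u$ with $l < 0 < u$ (a generic encoding, not specific to this reference):

    % Exact MIP encoding of y = max(0, \hat{x}) with bounds l \le \hat{x} \le u, l < 0 < u.
    y \ge 0, \quad y \ge \hat{x}, \quad y \le u\,a, \quad y \le \hat{x} - l\,(1 - a), \quad a \in \{0, 1\}.

Here $a = 1$ forces $y = \hat{x}$ (the unit is active) and $a = 0$ forces $y = 0$, so a MIP solver can branch on the binary variables to verify or refute a robustness property exactly; the quality of the bounds $l, u$ largely determines how fast this is.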
• Computer Science
ArXiv
• 2018
This work shows how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy and allows the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.
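A minimal sketch of the interval propagation step IBP is built on, for a single affine layer followed by a ReLU (illustrative; the reference's contribution is the training procedure built around such bounds):

    import numpy as np

    def ibp_layer(W, b, lo, hi):
        """Propagate an elementwise interval [lo, hi] through x -> relu(W @ x + b).

        Splitting W into positive and negative parts gives sound (if loose)
        pre-activation bounds; ReLU is monotone, so clamping at zero preserves them.
        """
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        pre_lo = W_pos @ lo + W_neg @ hi + b
        pre_hi = W_pos @ hi + W_neg @ lo + b
        return np.maximum(pre_lo, 0.0), np.maximum(pre_hi, 0.0)

Starting from lo = x − ε and hi = x + ε and chaining such layers yields interval bounds on the logits; if the lower bound of the true class exceeds the upper bounds of all other classes, the input is certified for that l∞ radius.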
• Computer Science
ICLR
• 2018
This paper provides a theoretical justification for converting robustness analysis into a local Lipschitz constant estimation problem, and proposes to use Extreme Value Theory for efficient evaluation, which yields a novel robustness metric called CLEVER, short for Cross Lipschitz Extreme Value for nEtwork Robustness.
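A stripped-down version of the local Lipschitz estimation this refers to is to sample gradient norms of the class-margin function in a ball around the input; CLEVER's contribution is to replace the plain maximum below with an extreme value (reverse Weibull) fit, which is what makes the estimate sample-efficient. The grad_fn argument is a hypothetical user-supplied gradient of the margin.

    import numpy as np

    def sampled_lipschitz_estimate(grad_fn, x, radius, n_samples=1024, rng=None):
        """Max gradient norm over random points in an l2 ball around x.

        grad_fn(z) is assumed to return the gradient of the margin
        g(z) = f_y(z) - f_j(z).  margin / estimate then gives an *estimated*
        (not certified) robustness radius; CLEVER fits an extreme value
        distribution to batches of these norms instead of taking a raw max.
        """
        rng = np.random.default_rng() if rng is None else rng
        best = 0.0
        for _ in range(n_samples):
            d = rng.normal(size=x.shape)
            d *= radius * rng.uniform() ** (1.0 / x.size) / np.linalg.norm(d)  # uniform in the ball
            best = max(best, float(np.linalg.norm(grad_fn(x + d))))
        return best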
• Computer Science
ICLR
• 2018
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
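The first-order adversary in question is typically instantiated as projected gradient descent (PGD) on the loss; below is a minimal numpy sketch with a hypothetical user-supplied grad_fn (gradient of the loss with respect to the input), assuming inputs live in [0, 1].

    import numpy as np

    def pgd_linf(grad_fn, x, eps, alpha=None, steps=40):
        """PGD inner maximization for l_inf robust optimization (illustrative).

        grad_fn(z) returns the gradient of the loss w.r.t. the input z.
        Each step ascends the loss and projects back onto the eps-ball.
        """
        alpha = 2.5 * eps / steps if alpha is None else alpha
        rng = np.random.default_rng()
        x_adv = x + rng.uniform(-eps, eps, size=x.shape)     # random start in the ball
        for _ in range(steps):
            x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))  # ascend the loss
            x_adv = np.clip(x_adv, x - eps, x + eps)         # project onto the l_inf ball
            x_adv = np.clip(x_adv, 0.0, 1.0)                 # keep a valid input
        return x_adv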
• Computer Science
ICML
• 2018
It is shown that, in fact, there is no polynomial-time algorithm that can approximately find the minimum adversarial distortion of a ReLU network with a $0.99 \ln n$ approximation ratio unless $\mathsf{NP} = \mathsf{P}$, where $n$ is the number of neurons in the network.
• Computer Science
ICLR
• 2018
This work provides a training procedure that augments model parameter updates with worst-case perturbations of training data and efficiently certify robustness for the population loss by considering a Lagrangian penalty formulation of perturbing the underlying data distribution in a Wasserstein ball.
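The Lagrangian penalty formulation referred to here replaces the hard Wasserstein-ball constraint with a penalized inner maximization; schematically (a sketch only, with $\gamma$ the penalty weight, $c$ a transport cost such as the squared l2 distance, and $\ell$ the training loss):

    % Lagrangian relaxation of distributionally robust training (penalty weight \gamma).
    \min_{\theta} \; \mathbb{E}_{z_0 \sim P_0}\Big[ \sup_{z} \big\{ \ell(\theta; z) - \gamma\, c(z, z_0) \big\} \Big].

For sufficiently large $\gamma$ (relative to the smoothness of $\ell$) the inner problem becomes strongly concave, so it can be approximately solved with a few gradient ascent steps, which is what makes the certificate on the population loss efficiently trainable.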
• Computer Science
AAAI
• 2018
The authors' elastic-net attacks to DNNs (EAD) feature L1-oriented adversarial examples and include the state-of-the-art L2 attack as a special case, suggesting novel insights on leveraging L1 distortion in adversarial machine learning and security implications of DNNs.
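For concreteness, the elastic-net attack objective has roughly the following form, with $x_0$ the original input, $t$ the target class, $f$ a margin-style attack loss, and $c$, $\beta$ trade-off weights; setting $\beta = 0$ recovers the L2 attack mentioned as the special case:

    % Elastic-net attack: L1 + squared-L2 regularized margin loss over valid inputs.
    \min_{x \in [0,1]^p} \; c \cdot f(x, t) \;+\; \beta\, \|x - x_0\|_1 \;+\; \|x - x_0\|_2^2.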