Corpus ID: 59599987

Robustness Certificates Against Adversarial Examples for ReLU Networks

Sahil Singla and Soheil Feizi
While neural networks have achieved high performance in different learning tasks, their accuracy drops significantly in the presence of small adversarial perturbations to inputs. Defenses based on regularization and adversarial training are often followed by new attacks to defeat them. In this paper, we propose attack-agnostic robustness certificates for a multi-label classification problem using a deep ReLU network. Although computing the exact distance of a given input sample to the… 

Figures and Tables from this paper

Center Smoothing: Certified Robustness for Networks with Structured Outputs

The center smoothing procedure can produce models with the guarantee that the change in the output, as measured by the distance metric, remains small for any norm-bounded adversarial perturbation of the input.
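The guarantee above can be pictured with a Monte-Carlo sketch: sample outputs under Gaussian input noise and return an approximate center of the smallest ball covering half of them. The helper `center_smooth` below is hypothetical (a medoid proxy for the minimum enclosing ball, assuming a Euclidean output metric), not the paper's exact procedure:

```python
import numpy as np

def center_smooth(f, x, sigma=0.25, n=200, seed=0):
    # Sample outputs under Gaussian input noise, then pick the sampled
    # output whose ball covering half of all samples has the smallest
    # radius -- a cheap medoid proxy for the minimum enclosing ball.
    rng = np.random.default_rng(seed)
    outs = np.stack([f(x + sigma * rng.standard_normal(x.shape))
                     for _ in range(n)])
    d = np.linalg.norm(outs[:, None, :] - outs[None, :, :], axis=-1)
    radii = np.sort(d, axis=1)[:, n // 2]   # distance covering half the samples
    return outs[np.argmin(radii)]

f = lambda z: z + 1.0                       # toy structured-output function
y = center_smooth(f, np.zeros(3))
```

For the toy shift function, the smoothed output lands near the constant offset, illustrating that the procedure is stable under input noise.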

Certifying Model Accuracy under Distribution Shifts

This work presents provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution, and shows provable lower bounds on the performance of models trained on so-called “unlearnable” datasets that have been poisoned to interfere with model training.

Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes

This paper presents GeoCert, a novel method for computing the exact pointwise robustness of deep neural networks for all convex $\ell_p$ norms, and shows that piecewise linear neural networks partition the input space into a polyhedral complex.
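The polyhedral-complex observation can be checked directly: once the ReLU activation pattern is fixed, a feedforward ReLU network is exactly affine on the corresponding region. A minimal numpy demonstration with a random two-layer network (all names here are illustrative, not GeoCert itself):

```python
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((8, 4)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((3, 8)), rng.standard_normal(3)

def net(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x = rng.standard_normal(4)
pattern = (W1 @ x + b1 > 0).astype(float)   # ReLU activation pattern at x
# Inside the polytope where this pattern is fixed, the net is exactly affine:
W_eff = W2 @ (pattern[:, None] * W1)
b_eff = W2 @ (pattern * b1) + b2
```

The effective affine map `(W_eff, b_eff)` reproduces the network's output at `x`, and at every point sharing the same activation pattern.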

Robustness for Non-Parametric Classification: A Generic Attack and Defense

This work takes a holistic look at adversarial examples for non-parametric classifiers, including nearest neighbors, decision trees, and random forests, and derives an optimally robust classifier, which is analogous to the Bayes Optimal.

Center Smoothing: Provable Robustness for Functions with Metric-Space Outputs

The robustness certificates guarantee that the change in the output of the smoothed model as measured by the distance metric remains small for any norm-bounded perturbation of the input.

Attack Agnostic Detection of Adversarial Examples via Random Subspace Analysis

This work presents a technique that utilizes properties of random projections to characterize the behavior of clean and adversarial examples across a diverse set of subspaces and demonstrates that this technique outperforms competing detection strategies while remaining truly agnostic to the attack strategy.
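One way to picture the idea, assuming projection norms are a usable per-subspace statistic (the paper's actual characterization may differ), is a hypothetical `subspace_features` scorer that flags samples whose behavior across random subspaces departs from the clean-data profile:

```python
import numpy as np

def subspace_features(X, k=10, dim=5, seed=0):
    # Project each sample onto k random dim-dimensional subspaces and
    # record the projection norms (an illustrative statistic only).
    rng = np.random.default_rng(seed)
    feats = []
    for _ in range(k):
        P, _ = np.linalg.qr(rng.standard_normal((X.shape[1], dim)))
        feats.append(np.linalg.norm(X @ P, axis=1))
    return np.stack(feats, axis=1)

rng = np.random.default_rng(1)
clean = rng.standard_normal((100, 50))
shifted = clean + 3.0                      # crude stand-in for perturbed inputs
mu = subspace_features(clean).mean(axis=0)
score = np.linalg.norm(subspace_features(shifted) - mu, axis=1)
baseline = np.linalg.norm(subspace_features(clean) - mu, axis=1)
```

Perturbed samples score higher than clean ones under this statistic, which is the kind of separation a subspace-based detector exploits.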

Random Projections for Adversarial Attack Detection

This work presents a technique that uses special properties of random projections to characterize the behavior of clean and adversarial examples across a diverse set of subspaces, and outperforms competing state-of-the-art detection strategies while remaining truly agnostic to the attack method itself.

Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness

It is shown that extending the smoothing technique to defend against other attack models can be challenging, especially in the high-dimensional regime, and it is established that Gaussian smoothing provides the best possible results, up to a constant factor, when $p \geq 2$.

Certifying Confidence via Randomized Smoothing

It is demonstrated that extra information about the base classifier at the input point can help improve certified guarantees for the smoothed classifier.
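The standard $\ell_2$ randomized-smoothing certificate gives radius $\sigma \, \Phi^{-1}(p_A)$, where $p_A$ is the probability of the top class under Gaussian noise. A rough Monte-Carlo sketch (using an empirical rather than high-confidence lower-bound estimate of $p_A$, so not a sound certificate):

```python
import numpy as np
from statistics import NormalDist

def certified_radius(f, x, sigma=0.5, n=1000, seed=0):
    # Monte-Carlo sketch of an l2 randomized-smoothing certificate:
    # radius = sigma * Phi^{-1}(p_A).  A real certificate would replace
    # the empirical p_A with a high-confidence lower bound.
    rng = np.random.default_rng(seed)
    votes = np.array([f(x + sigma * rng.standard_normal(x.shape))
                      for _ in range(n)])
    top = np.bincount(votes).argmax()
    p_a = min((votes == top).mean(), 1.0 - 1.0 / n)   # keep Phi^{-1} finite
    if p_a <= 0.5:
        return top, 0.0
    return top, sigma * NormalDist().inv_cdf(p_a)

f = lambda z: int(z[0] > 0)                # toy binary classifier
cls, r = certified_radius(f, np.array([1.0, 0.0]))
```

For the toy classifier, almost all noisy votes agree, so the certified radius is close to $\sigma \, \Phi^{-1}(\Phi(2)) = 1$.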

Certified Robustness of Graph Convolution Networks for Graph Classification under Topological Attacks

The proposed method is based on Lagrangian dualization and convex envelopes, which result in tight approximation bounds that are efficiently computable by dynamic programming and allow an increased number of graphs to be certified as robust.

Certified Defenses against Adversarial Examples

This work proposes a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value, providing an adaptive regularizer that encourages robustness against all attacks.

Provable defenses against adversarial examples via the convex outer adversarial polytope

A method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations, and it is shown that the dual problem to this linear program can be represented itself as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss.

Efficient Neural Network Robustness Certification with General Activation Functions

This paper introduces CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points and facilitates the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation.
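CROWN-style frameworks bound each activation between two linear functions of its preactivation; for ReLU with preactivation bounds $[l, u]$, a common choice is the chord as upper bound and an adaptive 0/1 slope as lower bound. A sketch of that elementwise relaxation (variable names are ours, not CROWN's API):

```python
import numpy as np

def relu_linear_bounds(l, u):
    # Elementwise linear relaxation of ReLU on preactivation bounds [l, u]:
    #   a_lo*z + b_lo <= relu(z) <= a_up*z + b_up   for all z in [l, u].
    a_up = np.zeros_like(l); b_up = np.zeros_like(l)
    a_lo = np.zeros_like(l); b_lo = np.zeros_like(l)
    active = l >= 0                        # always-on neurons: relu(z) = z
    a_up[active] = a_lo[active] = 1.0
    cross = (l < 0) & (u > 0)              # unstable neurons
    s = u[cross] / (u[cross] - l[cross])   # chord slope for the upper bound
    a_up[cross], b_up[cross] = s, -s * l[cross]
    a_lo[cross] = (u[cross] > -l[cross]).astype(float)  # adaptive lower slope
    return a_up, b_up, a_lo, b_lo

l = np.array([-1.0, 0.5, -2.0])
u = np.array([ 1.0, 2.0, -0.5])
a_up, b_up, a_lo, b_lo = relu_linear_bounds(l, u)
```

The adaptive lower slope (identity when $u > -l$, zero otherwise) is one of the surrogate choices the paper's framework selects per neuron to tighten the certified bound.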

Evaluating Robustness of Neural Networks with Mixed Integer Programming

Verification of piecewise-linear neural networks is formulated as a mixed integer program that is able to certify more samples than the state of the art and to find more adversarial examples than a strong first-order attack for every network.

On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models

This work shows how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy and allows the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.
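Interval bound propagation itself is only a few lines: push an $\ell_\infty$ ball through each affine layer using the absolute weight matrix, then through the monotone ReLU. A minimal sketch, assuming a ReLU after every layer (real IBP training leaves the final logits un-activated):

```python
import numpy as np

def ibp_bounds(x, eps, layers):
    # Propagate the l_inf ball [x-eps, x+eps] through affine+ReLU layers.
    lo, hi = x - eps, x + eps
    for W, b in layers:
        mid, rad = (lo + hi) / 2.0, (hi - lo) / 2.0
        center = W @ mid + b
        radius = np.abs(W) @ rad           # worst-case interval half-width
        # ReLU is monotone, so bounds map through it directly:
        lo = np.maximum(center - radius, 0.0)
        hi = np.maximum(center + radius, 0.0)
    return lo, hi

W = np.array([[1.0, 0.0], [0.0, 1.0]])     # identity layer for illustration
b = np.zeros(2)
lo, hi = ibp_bounds(np.array([1.0, -1.0]), 0.5, [(W, b)])
```

Every true activation for inputs in the ball is guaranteed to lie inside `[lo, hi]`, which is the invariant IBP training tightens.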

Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach

This paper provides a theoretical justification for converting robustness analysis into a local Lipschitz constant estimation problem, and proposes to use Extreme Value Theory for efficient evaluation, yielding a novel robustness metric called CLEVER, short for Cross Lipschitz Extreme Value for nEtwork Robustness.

Towards Deep Learning Models Resistant to Adversarial Attacks

This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
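The first-order adversary in this line of work is typically instantiated as projected gradient descent in an $\ell_\infty$ ball. A minimal sketch with a caller-supplied loss gradient (function names are illustrative):

```python
import numpy as np

def pgd_linf(grad_fn, x, eps=0.1, alpha=0.02, steps=20):
    # Ascend the loss via sign gradients, projecting back onto the
    # l_inf ball [x-eps, x+eps] after every step.
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

# toy loss L(z) = sum(z): the attack pushes every coordinate upward
grad_fn = lambda z: np.ones_like(z)
x_adv = pgd_linf(grad_fn, np.zeros(4))
```

Adversarial training in this framework solves the outer minimization over model parameters with such PGD iterates supplying the inner maximization.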

Towards Fast Computation of Certified Robustness for ReLU Networks

It is shown that, in fact, there is no polynomial time algorithm that can approximately find the minimum adversarial distortion of a ReLU network with a $0.99\ln n$ approximation ratio unless $\mathsf{NP} = \mathsf{P}$, where $n$ is the number of neurons in the network.

Certifying Some Distributional Robustness with Principled Adversarial Training

This work provides a training procedure that augments model parameter updates with worst-case perturbations of training data and efficiently certify robustness for the population loss by considering a Lagrangian penalty formulation of perturbing the underlying data distribution in a Wasserstein ball.

EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

The authors' elastic-net attacks to DNNs (EAD) feature L1-oriented adversarial examples and include the state-of-the-art L2 attack as a special case, suggesting novel insights on leveraging L1 distortion in adversarial machine learning and the security implications of DNNs.
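The L1 part of an elastic-net attack objective is typically handled with an ISTA-style proximal step. A sketch of that soft-thresholding toward the original input (a generic ISTA shrinkage step under the assumption that the L1 term penalizes the distortion from the original input, not necessarily EAD's exact update):

```python
import numpy as np

def shrink(z, x0, beta):
    # Soft-thresholding toward the original input x0: the proximal step
    # for an L1 penalty beta * ||z - x0||_1 in the attack objective.
    d = z - x0
    return x0 + np.sign(d) * np.maximum(np.abs(d) - beta, 0.0)

out = shrink(np.array([1.0, 0.05, -1.0]), np.zeros(3), 0.1)
```

Coordinates whose distortion is below the threshold are snapped back to the original input, which is what drives the sparse, L1-oriented perturbations.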