Center Smoothing: Certified Robustness for Networks with Structured Outputs
@inproceedings{Kumar2021CenterSC, title={Center Smoothing: Certified Robustness for Networks with Structured Outputs}, author={Aounon Kumar}, booktitle={NeurIPS}, year={2021} }
The study of provable adversarial robustness has mostly been limited to classification tasks and models with one-dimensional real-valued outputs. We extend the scope of certifiable robustness to problems with more general and structured outputs like sets, images, language, etc. We model the output space as a metric space under a distance/similarity function, such as intersection-over-union, perceptual similarity, total variation distance, etc. Such models are used in many machine learning…
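As a rough illustration of treating structured outputs as points in a metric space, the sketch below computes a distance derived from intersection-over-union for set-valued outputs. The function name `iou_distance` and the convention that two empty sets are at distance 0 are assumptions made for this example; they are not taken from the paper.

```python
def iou_distance(a, b):
    """Return 1 - IoU (the Jaccard distance) between two sets.

    Assumption for illustration: two empty sets are treated as identical,
    so their distance is defined as 0.0.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    intersection = a & b
    return 1.0 - len(intersection) / len(union)


# Two predicted label sets that share two of four distinct elements.
print(iou_distance({1, 2, 3}, {2, 3, 4}))
```

A distance of 0 means the two outputs coincide, and a distance of 1 means they share no elements.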
4 Citations
Certifying Model Accuracy under Distribution Shifts
- Computer Science, ArXiv
- 2022
This work presents provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution, and shows provable lower bounds on the performance of models trained on so-called “unlearnable” datasets that have been poisoned to interfere with model training.
Latent Space Smoothing for Individually Fair Representations
- Computer Science
- 2021
LASSI, the first representation learning method for certifying individual fairness of high-dimensional data, is introduced and it is demonstrated that the representations obtained by LASSI can be used to solve classification tasks that were unseen during training.
Latent Space Smoothing for Individually Fair Representations
- Computer Science
- 2022
LASSI, the first representation learning method for certifying individual fairness of high-dimensional data, is introduced and it is demonstrated that the representations obtained by LASSI can be used to solve classification tasks that were unseen during training.
Towards Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
- Computer Science, ArXiv
- 2022
This work studies and designs adversarial attacks on multivariate probabilistic forecasting models, taking into consideration attack budget constraints and the correlation architecture between multiple time series, and develops two defense strategies.
References
SHOWING 1-10 OF 70 REFERENCES
Robustness Certificates Against Adversarial Examples for ReLU Networks
- Computer Science, ArXiv
- 2019
This paper proposes attack-agnostic robustness certificates for a multi-label classification problem using a deep ReLU network; the certificates have a closed form, are differentiable, and are an order of magnitude faster to compute than existing methods, even for deep networks.
On the Robustness of Semantic Segmentation Models to Adversarial Attacks
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This paper presents what is, to the authors' knowledge, the first rigorous evaluation of adversarial attacks on modern semantic segmentation models, using two large-scale datasets, and shows how mean-field inference in deep structured models and multiscale processing naturally implement recently proposed adversarial defenses.
Improving Robustness of Deep-Learning-Based Image Reconstruction
- Computer Science, ICML
- 2020
This paper proposes to modify the training strategy of end-to-end deep-learning-based inverse problem solvers to improve robustness, and introduces an auxiliary network to generate adversarial examples, used in a min-max formulation to build robust image reconstruction networks.
Robustness Certificates for Sparse Adversarial Attacks by Randomized Ablation
- Computer Science, AAAI
- 2020
This paper proposes an efficient and certifiably robust defense against sparse adversarial attacks by randomly ablating input features, rather than using additive noise, and empirically demonstrates that the classifier is highly robust to modern sparse adversarial attacks on MNIST.
Certified Adversarial Robustness via Randomized Smoothing
- Computer Science, ICML
- 2019
Strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies.
Second-Order Provable Defenses against Adversarial Attacks
- Computer Science, ICML
- 2020
This paper shows that if the eigenvalues of the Hessian of the network are bounded, the authors can compute a robustness certificate in the $l_2$ norm efficiently using convex optimization and derives a computationally-efficient differentiable upper bound on the curvature of a deep network.
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
- Computer Science, NeurIPS
- 2019
It is demonstrated through extensive experimentation that this method consistently outperforms all existing provably $\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable $\ell_2$-defenses.
(De)Randomized Smoothing for Certifiable Defense against Patch Attacks
- Computer Science, NeurIPS
- 2020
This paper introduces a certifiable defense against patch attacks that guarantees that, for a given image and patch attack size, no patch adversarial examples exist. The defense is related to the broad class of randomized smoothing robustness schemes, which provide high-confidence probabilistic robustness certificates.
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models
- Computer Science, ArXiv
- 2018
This work shows how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy and allows the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.
Stochastic Activation Pruning for Robust Adversarial Defense
- Computer Science, ICLR
- 2018
Stochastic Activation Pruning (SAP) is proposed, a mixed strategy for adversarial defense that prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate.
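The pruning rule summarized in the last entry can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `stochastic_activation_pruning`, the `num_draws` parameter, and the choice to sample with replacement in proportion to absolute magnitude are assumptions made for this example. Survivors are divided by their probability of being kept, so each activation is preserved in expectation.

```python
import random


def stochastic_activation_pruning(activations, num_draws, rng=None):
    """Prune a random subset of activations and rescale the survivors.

    Hypothetical sketch: indices are drawn with replacement, with
    probability proportional to |activation|, so small activations are
    pruned preferentially. Any index never drawn is set to zero.
    """
    if num_draws < 1:
        raise ValueError("num_draws must be at least 1")
    rng = rng or random.Random()

    magnitudes = [abs(a) for a in activations]
    total = sum(magnitudes)
    if total == 0:
        # Nothing to sample from: every activation is already zero.
        return [0.0 for _ in activations]

    probs = [m / total for m in magnitudes]
    kept = set(rng.choices(range(len(activations)), weights=probs, k=num_draws))

    pruned = []
    for j, a in enumerate(activations):
        if j in kept:
            # Probability that index j is drawn at least once in num_draws draws.
            keep_prob = 1.0 - (1.0 - probs[j]) ** num_draws
            pruned.append(a / keep_prob)
        else:
            pruned.append(0.0)
    return pruned


layer_output = [0.1, -2.0, 0.0, 3.0]
print(stochastic_activation_pruning(layer_output, num_draws=3, rng=random.Random(0)))
```

Because the kept set is random, repeated calls on the same input generally produce different pruned outputs; passing a seeded `random.Random` makes a run reproducible.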