Accelerating Certified Robustness Training via Knowledge Transfer

@article{Vaishnavi2022AcceleratingCR,
  title={Accelerating Certified Robustness Training via Knowledge Transfer},
  author={Pratik Vaishnavi and Kevin Eykholt and Amir Rahmati},
  journal={ArXiv},
  year={2022},
  volume={abs/2210.14283}
}
Training deep neural network classifiers that are certifiably robust against adversarial attacks is critical to ensuring the security and reliability of AI-controlled systems. Although numerous state-of-the-art certified training methods have been developed, they are computationally expensive and scale poorly with respect to both dataset and network complexity. Widespread usage of certified training is further hindered by the fact that periodic retraining is necessary to incorporate new data and… 

References

Showing 1-10 of 42 references

Certified Adversarial Robustness with Additive Noise

This work establishes a connection between robustness against adversarial perturbation and additive random noise, and proposes a training strategy that can significantly improve the certified bounds.
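
At inference time, an additive-noise (randomized smoothing) classifier can be approximated by a majority vote over Gaussian-perturbed copies of the input. A minimal PyTorch sketch, assuming `model` returns logits; `sigma` and `n_samples` are illustrative hyperparameters, not the paper's exact settings:

```python
import torch

def smoothed_predict(model, x, sigma=0.25, n_samples=100):
    """Approximate the smoothed classifier g(x) = argmax_c P(f(x + d) = c),
    d ~ N(0, sigma^2 I), by majority vote over noisy copies of a single x."""
    with torch.no_grad():
        # Replicate x into a batch of n_samples Gaussian-perturbed copies.
        noisy = x.unsqueeze(0) + sigma * torch.randn(n_samples, *x.shape)
        preds = model(noisy).argmax(dim=1)  # predicted class per noisy copy
        # The majority vote approximates the smoothed classifier's output.
        return torch.bincount(preds).argmax().item()
```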

Scalable Verified Training for Provably Robust Image Classification

This work shows how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state of the art in verified accuracy, and allows the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.
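
Interval bound propagation itself is simple enough to sketch: push an elementwise lower/upper bound on the input through each layer. A minimal PyTorch version for one affine layer and a ReLU (the surrounding network and perturbation setup are assumed):

```python
import torch

def ibp_affine(lb, ub, W, b):
    """Propagate elementwise bounds [lb, ub] through y = x @ W.T + b."""
    center, radius = (ub + lb) / 2, (ub - lb) / 2
    new_center = center @ W.t() + b
    new_radius = radius @ W.abs().t()  # |W| maps input radius to output radius
    return new_center - new_radius, new_center + new_radius

def ibp_relu(lb, ub):
    """ReLU is monotone, so bounds pass through elementwise."""
    return lb.clamp(min=0), ub.clamp(min=0)
```

Starting from `lb = x - eps` and `ub = x + eps` and chaining such functions layer by layer yields sound, if loose, bounds on the logits; verified IBP training then minimizes the loss under the worst-case logits those bounds allow.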

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness

The proposed training scheme identifies over-confident, near off-class samples as a cause of limited robustness in smoothed classifiers, and trains on convex combinations of samples along the direction of adversarial perturbation for each input, offering an intuitive way to adaptively set a new decision boundary between these samples for better robustness.
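
Loosely, the convex-combination step can be sketched as below, where `x_adv` is assumed to come from an attack on the smoothed classifier and the paper's label handling is omitted:

```python
import torch

def convex_mix(x, x_adv):
    """Sample a point on the segment between each clean input and its
    adversarial counterpart (lam ~ U(0, 1), drawn once per example)."""
    lam = torch.rand(x.size(0), *([1] * (x.dim() - 1)), device=x.device)
    return (1 - lam) * x + lam * x_adv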

Towards Stable and Efficient Training of Verifiably Robust Neural Networks

CROWN-IBP is computationally efficient and consistently outperforms IBP baselines when training verifiably robust neural networks, and outperforms all previous linear relaxation and bound propagation based certified defenses in $\ell_\infty$ robustness.
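
A rough sketch of a CROWN-IBP-style objective, assuming worst-case logits from both bounding methods are already computed; the fixed `beta`/`kappa` values stand in for the schedules the paper anneals during training:

```python
import torch.nn.functional as F

def crown_ibp_style_loss(clean_logits, worst_ibp, worst_crown, labels,
                         beta=0.5, kappa=0.5):
    """Interpolate worst-case logits from IBP and from the tighter linear
    relaxation (CROWN), then mix the robust loss with the clean loss."""
    worst = (1 - beta) * worst_ibp + beta * worst_crown
    return kappa * F.cross_entropy(clean_logits, labels) + \
           (1 - kappa) * F.cross_entropy(worst, labels)
```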

Scaling provable adversarial defenses

This paper presents a technique for extending provably robust training procedures to much more general networks, with skip connections and general nonlinearities, and shows how to further improve robust error through cascade models.

Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better

This work empirically demonstrates the effectiveness of the RSLAD approach over existing adversarial training and distillation methods in improving the robustness of small models against state-of-the-art attacks, including AutoAttack.
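
The robust-soft-label idea reduces to KL terms against the teacher's predictions on clean inputs. A sketch, assuming `x_adv` is crafted against the student (e.g. with PGD) and using an illustrative, unweighted combination of the two terms:

```python
import torch.nn.functional as F

def rslad_style_loss(student, teacher, x, x_adv):
    """Supervise the student on clean and adversarial inputs with the
    teacher's soft predictions on the *clean* inputs (robust soft labels)."""
    with torch.no_grad():
        soft = F.softmax(teacher(x), dim=1)  # robust soft labels
    kl_clean = F.kl_div(F.log_softmax(student(x), dim=1), soft,
                        reduction="batchmean")
    kl_adv = F.kl_div(F.log_softmax(student(x_adv), dim=1), soft,
                      reduction="batchmean")
    return kl_clean + kl_adv
```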

A Framework for Robustness Certification of Smoothed Classifiers Using F-Divergences

This paper extends randomized smoothing procedures to handle arbitrary smoothing measures and proves robustness of the smoothed classifier using $f$-divergences, achieving state-of-the-art certified robustness on MNIST, CIFAR-10, and ImageNet, as well as on the Librispeech audio classification task, with respect to several classes of adversarial perturbations.

Certified Defenses against Adversarial Examples

This work proposes a method based on a semidefinite relaxation that outputs a certificate that, for a given network and test input, no attack can force the error to exceed a certain value, providing an adaptive regularizer that encourages robustness against all attacks.

Towards Deep Learning Models Resistant to Adversarial Attacks

This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
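
The first-order adversary in question is typically instantiated as projected gradient descent (PGD). A minimal $\ell_\infty$ sketch, with illustrative step size and iteration count:

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Generate l_inf-bounded adversarial examples via projected gradient descent."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball and valid range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv
```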

Adversarially Robust Distillation

This work finds that a student may inherit a large amount of robustness even when distilled only on clean images, and introduces Adversarially Robust Distillation (ARD) for distilling robustness onto student networks.
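
An ARD-style objective can be sketched as follows; the temperature `T`, weighting `alpha`, and the attack behind `x_adv` are illustrative assumptions rather than the paper's exact settings:

```python
import torch.nn.functional as F

def ard_style_loss(student, teacher, x, x_adv, y, alpha=0.9, T=4.0):
    """Distill the teacher's clean soft labels onto the student's adversarial
    predictions, mixed with a clean cross-entropy term."""
    with torch.no_grad():
        teacher_soft = F.softmax(teacher(x) / T, dim=1)
    kd = F.kl_div(F.log_softmax(student(x_adv) / T, dim=1),
                  teacher_soft, reduction="batchmean") * (T * T)
    return alpha * kd + (1 - alpha) * F.cross_entropy(student(x), y)
```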