On the Robustness to Adversarial Examples of Neural ODE Image Classifiers

@article{Carrara2019OnTR,
  title={On the Robustness to Adversarial Examples of Neural ODE Image Classifiers},
  author={Fabio Carrara and Roberto Caldelli and F. Falchi and Giuseppe Amato},
  journal={2019 IEEE International Workshop on Information Forensics and Security (WIFS)},
  year={2019},
  pages={1-6}
}
  • F. Carrara, R. Caldelli, F. Falchi, G. Amato
  • Published 1 December 2019
  • Computer Science
  • 2019 IEEE International Workshop on Information Forensics and Security (WIFS)
The vulnerability of deep neural networks to adversarial attacks currently represents one of the most challenging open problems in the deep learning field. The NeurIPS 2018 work that obtained the best paper award proposed a new paradigm for defining deep neural networks with continuous internal activations. In this kind of network, dubbed Neural ODE Networks, a continuous hidden state can be defined via parametric ordinary differential equations, and its dynamics can be adjusted to build…
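
As an illustration of this formulation, the sketch below builds a toy Neural ODE classifier in PyTorch with the torchdiffeq package; the module names (ODEFunc, ODEBlock), layer sizes, and integration horizon are illustrative assumptions, not the architecture evaluated in the paper.

# Illustrative Neural ODE block (assumes PyTorch and the torchdiffeq package);
# names and sizes are toy choices, not the architecture evaluated in the paper.
import torch
import torch.nn as nn
from torchdiffeq import odeint

class ODEFunc(nn.Module):
    """Parametric dynamics f(h(t), t; theta) of the continuous hidden state."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, t, h):
        return self.net(h)

class ODEBlock(nn.Module):
    """Integrates the hidden state from t=0 to t=T, replacing a stack of discrete layers."""
    def __init__(self, func, T=1.0):
        super().__init__()
        self.func = func
        self.register_buffer("t", torch.tensor([0.0, T]))

    def forward(self, h0):
        return odeint(self.func, h0, self.t)[-1]   # solution of the ODE at the final time T

# Toy classifier: linear feature map -> continuous ODE block -> linear head
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64),
                      ODEBlock(ODEFunc(64)), nn.Linear(64, 10))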

Citations

Improving Neural ODEs via Knowledge Distillation
TLDR
The experimental results show that the new training manner can improve the classification accuracy of Neural ODEs by 24% on CIFAR10 and 5% on SVHN, and the effect of both knowledge distillation and the time horizon of Neural ODEs on robustness against adversarial examples is discussed quantitatively.
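
A generic sketch of the distillation objective named above, assuming a conventional CNN teacher and a Neural ODE student; the temperature and weighting are illustrative, not the paper's settings (the time horizon corresponds to the integration endpoint of the ODE block in the earlier sketch).

# Generic knowledge-distillation loss for a Neural ODE student trained against a CNN
# teacher; temperature T_kd and weight alpha are illustrative, not the paper's settings.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T_kd=4.0, alpha=0.9):
    soft = F.kl_div(F.log_softmax(student_logits / T_kd, dim=1),
                    F.softmax(teacher_logits / T_kd, dim=1),
                    reduction="batchmean") * (T_kd * T_kd)   # match softened teacher outputs
    hard = F.cross_entropy(student_logits, labels)           # usual supervised term
    return alpha * soft + (1.0 - alpha) * hard
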
Standalone Neural ODEs with Sensitivity Analysis
This paper presents the Standalone Neural ODE (sNODE), a continuous-depth neural ODE model capable of describing a full deep neural network. This uses a novel nonlinear conjugate gradient (NCG)…
Meta-Solver for Neural Ordinary Differential Equations
TLDR
It is shown that the right choice of solver parameterization can significantly affect the robustness of Neural ODE models against adversarial attacks, and that robustness can be further improved by optimizing the solver choice for a given task.
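
The snippet below illustrates what solver parameterization means in practice with the common torchdiffeq interface: the integration scheme and its step size or tolerances are explicit arguments of the forward pass. This is a sketch of the interface, not the paper's meta-solver.

# Solver choice as a tunable part of the model (torchdiffeq interface; toy dynamics).
import torch
from torchdiffeq import odeint

W = 0.1 * torch.randn(64, 64)
func = lambda t, h: torch.tanh(h @ W)           # toy hidden-state dynamics f(h, t)
h0, t = torch.randn(8, 64), torch.tensor([0.0, 1.0])

h_euler = odeint(func, h0, t, method="euler",  options={"step_size": 0.1})[-1]
h_rk4   = odeint(func, h0, t, method="rk4",    options={"step_size": 0.25})[-1]
h_dopri = odeint(func, h0, t, method="dopri5", rtol=1e-3, atol=1e-4)[-1]
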
Reachability Analysis of a General Class of Neural Ordinary Differential Equations
TLDR
This paper considers a general class of neural ODEs with varying architectures and layers, and introduces a novel reachability framework that allows for the formal analysis of their behavior and is implemented in a new tool called NNVODE.

References

SHOWING 1-10 OF 35 REFERENCES
Detecting Adversarial Samples from Artifacts
TLDR
This paper investigates model confidence on adversarial samples by looking at Bayesian uncertainty estimates, available in dropout neural networks, and by performing density estimation in the subspace of deep features learned by the model; the result is a method for implicit adversarial detection that is oblivious to the attack algorithm.
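
A minimal sketch of the two detection signals named in this summary, assuming a PyTorch model with dropout layers and scikit-learn for the density estimate; the thresholding and detector-fitting step is omitted.

# Two detection signals: (1) predictive variance under test-time dropout,
# (2) kernel density of deep features (NumPy arrays); detector fitting omitted.
import torch
from sklearn.neighbors import KernelDensity

def mc_dropout_uncertainty(model, x, n_samples=20):
    model.train()                                  # keep dropout stochastic at test time
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=1) for _ in range(n_samples)])
    return probs.var(dim=0).sum(dim=1)             # high variance -> likely adversarial

def feature_log_density(train_features, test_features, bandwidth=1.0):
    kde = KernelDensity(bandwidth=bandwidth).fit(train_features)
    return kde.score_samples(test_features)        # low log-density -> likely adversarial
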
Exploring the space of adversarial images
  • Pedro Tabacof, E. Valle
  • Computer Science
    2016 International Joint Conference on Neural Networks (IJCNN)
  • 2016
TLDR
This work formalizes the problem of adversarial images given a pretrained classifier, showing that even in the linear case the resulting optimization problem is nonconvex and that a shallow classifier seems more robust to adversarial images than a deep convolutional network.
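
For concreteness, the simplest gradient-based instance of this optimization view is the fast gradient sign method (FGSM) of Goodfellow et al., sketched below; it is shown only as an illustration of how adversarial images are crafted from a pretrained classifier, not as the procedure used in this reference.

# FGSM: a one-step gradient attack on a pretrained classifier (illustration only).
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()                # step in the direction that increases the loss
    return x_adv.clamp(0.0, 1.0).detach()          # keep pixels in the valid range
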
Adversarial Examples Detection in Features Distance Spaces
TLDR
A method for detecting adversarial images that focuses on the trajectory of internal representations, from the very first hidden layer up to the last, and defines a distance-based embedding of features to efficiently encode this information.
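
A sketch of the idea, assuming a PyTorch model: forward hooks record the per-layer trajectory of an input, which is then encoded as distances to per-class feature centroids. The helper names and the centroid-based encoding are illustrative; the paper's exact embedding may differ.

# Record the internal-representation trajectory with forward hooks and encode it as
# per-layer distances to class centroids (illustrative sketch).
import torch

def feature_trajectory(model, layers, x):
    feats, hooks = [], []
    for layer in layers:
        hooks.append(layer.register_forward_hook(
            lambda mod, inp, out: feats.append(out.flatten(start_dim=1))))
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    return feats                                   # one (batch, dim_l) tensor per monitored layer

def distance_embedding(trajectory, centroids):
    # centroids: one (n_classes, dim_l) tensor per monitored layer
    return torch.stack([torch.cdist(f, c) for f, c in zip(trajectory, centroids)], dim=1)
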
Adversarial image detection in deep neural networks
TLDR
This paper investigates the robustness of the representations learned by the fooled neural network, analyzing the activations of its hidden layers, and shows that hidden-layer activations can be used to reveal incorrect classifications caused by adversarial attacks.
Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics
  • Xin Li, Fuxin Li
  • Computer Science
    2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
TLDR
After detecting adversarial examples, it is shown that many of them can be recovered by simply applying a small average filter to the image, which should lead to more insights about the classification mechanisms in deep convolutional neural networks.
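
The recovery step mentioned above amounts to a small box filter applied to the input before re-classification, e.g. as sketched here for a batch of images (the kernel size is an assumption).

# Small average (box) filter on a (B, C, H, W) image batch.
import torch
import torch.nn.functional as F

def average_filter(x, k=3):
    kernel = torch.ones(x.shape[1], 1, k, k, device=x.device) / (k * k)
    return F.conv2d(x, kernel, padding=k // 2, groups=x.shape[1])   # depthwise mean filter
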
Towards Evaluating the Robustness of Neural Networks
TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
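
For reference, the L2 variant of these attacks is commonly written as the optimization below, where Z(·) are the logits, t is the target class, κ a confidence margin, and c a trade-off constant chosen by binary search:

\min_{\delta}\ \|\delta\|_2^2 + c \cdot f(x+\delta), \qquad f(x') = \max\Bigl(\max_{i \neq t} Z(x')_i - Z(x')_t,\ -\kappa\Bigr)
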
The Limitations of Deep Learning in Adversarial Settings
TLDR
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
On Detecting Adversarial Perturbations
TLDR
It is shown empirically that adversarial perturbations can be detected surprisingly well even though they are quasi-imperceptible to humans.
DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks
TLDR
The DeepFool algorithm is proposed to efficiently compute perturbations that fool deep networks and thus reliably quantify the robustness of these classifiers; it outperforms recent methods in the task of computing adversarial perturbations and making classifiers more robust.
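
At its core, DeepFool uses the closed-form minimal perturbation for an affine binary classifier f(x) = w^\top x + b, shown below, and applies it iteratively to the local linearization of a general network (projecting toward the nearest linearized decision boundary in the multiclass case):

r_*(x) \;=\; -\,\frac{f(x)}{\|w\|_2^2}\, w
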
Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks
TLDR
The study shows that defensive distillation can reduce the effectiveness of adversarial sample creation from 95% to less than 0.5% on a studied DNN, and analytically investigates the generalizability and robustness properties granted by the use of defensive distillation when training DNNs.
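
Defensive distillation is built on the temperature-softened softmax below: a first network is trained at temperature T, a distilled network is then trained on its soft labels at the same T and deployed at T = 1, which flattens the input gradients exploited by gradient-based attacks:

\mathrm{softmax}_T(z)_i \;=\; \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}
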
...