Corpus ID: 4944472

NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations

@article{Ciccone2018NAISNetSD,
  title={NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations},
  author={Marco Ciccone and Marco Gallieri and Jonathan Masci and Christian Osendorfer and Faustino J. Gomez},
  journal={ArXiv},
  year={2018},
  volume={abs/1804.07209}
}
This paper introduces "Non-Autonomous Input-Output Stable Network" (NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforced so that blocks can be unrolled adaptively to a pattern-dependent processing depth. We prove that the network is globally asymptotically stable so… 
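The construction described above can be sketched in a few lines. The following is a minimal NumPy illustration of one unrolled NAIS-Net-style block, a time-invariant non-autonomous discrete-time system in which the block input u re-enters at every unrolled stage; the names (naisnet_block, n_unroll, h), the tanh nonlinearity, the state initialization, and the rough choice of A are assumptions of this sketch, and the paper's stability-enforcing constraint on the state matrix is not reproduced here.

import numpy as np

def naisnet_block(u, A, B, b, n_unroll=10, h=0.1):
    # One NAIS-Net-style block: x_{k+1} = x_k + h * tanh(A x_k + B u + b).
    # The block input u is re-injected at every stage (the skip connection
    # that makes the unrolled system non-autonomous).
    x = u.copy()  # state initialization; one plausible choice for this sketch
    for _ in range(n_unroll):
        x = x + h * np.tanh(A @ x + B @ u + b)
    return x

# Toy usage: 4-dimensional input, weights shared across all unrolled stages.
rng = np.random.default_rng(0)
A = -np.eye(4) + 0.1 * rng.standard_normal((4, 4))  # roughly stable linear part
B = rng.standard_normal((4, 4))
b = np.zeros(4)
u = rng.standard_normal(4)
print(naisnet_block(u, A, B, b))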

Citations

Continuous-in-Depth Neural Networks
TLDR
This work shows that neural network models can learn to represent continuous dynamical systems, with this richer structure and properties, by embedding them into higher-order numerical integration schemes, such as Runge–Kutta schemes, and introduces ContinuousNet as a continuous-in-depth generalization of ResNet architectures.
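As a rough illustration of the point in this summary, the sketch below contrasts a plain residual update (forward Euler with unit step, x + f(x)) with the same residual branch embedded in a classical fourth-order Runge–Kutta step; the toy branch residual_block and the step sizes are assumptions for the example, not ContinuousNet's actual implementation.

import numpy as np

def residual_block(x, W):
    # A toy residual branch f(x) = tanh(W x), shared here for simplicity.
    return np.tanh(W @ x)

def euler_step(x, W, h=1.0):
    # A standard ResNet update is forward Euler with step size 1: x + f(x).
    return x + h * residual_block(x, W)

def rk4_step(x, W, h=1.0):
    # The same branch embedded in a classical Runge-Kutta (RK4) step,
    # in the spirit of continuous-in-depth generalizations of ResNet.
    k1 = residual_block(x, W)
    k2 = residual_block(x + 0.5 * h * k1, W)
    k3 = residual_block(x + 0.5 * h * k2, W)
    k4 = residual_block(x + h * k3, W)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

x0, W = np.ones(3), 0.1 * np.eye(3)
print(euler_step(x0, W), rk4_step(x0, W))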
Residual networks classify inputs based on their neural transient dynamics
TLDR
This study provides analytical and empirical evidence that residual networks classify inputs based on the integration of the transient dynamics of the residuals, and develops a new method to adjust the depth of residual networks during training.
Reinforcing Neural Network Stability with Attractor Dynamics
TLDR
This paper reinforces the stability of DNNs without changing their original structure by modeling the attractor dynamics of a DNN, and proposes the relu-max attractor network (RMAN), a light-weight module that can readily be deployed on state-of-the-art ResNet-like networks.
Compressing Deep ODE-Nets using Basis Function Expansions
TLDR
This work reconsiders formulations of the weights as continuous-depth functions, using linear combinations of basis functions to compress the weights through a change of basis, without retraining, while maintaining near state-of-the-art performance.
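A rough sketch of the compression idea described above, assuming a synthetic stack of depth-dependent weights and a small polynomial basis (both assumptions of this example, not the paper's setup): the per-layer weights are projected onto the basis by least squares, so only the expansion coefficients need to be stored.

import numpy as np

# Hypothetical per-layer weights of a deep network: depth x (d x d).
depth, d = 32, 8
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, depth)  # "continuous depth" coordinate
layer_weights = np.stack(
    [np.sin(3 * tk) * np.eye(d) + 0.01 * rng.standard_normal((d, d)) for tk in t]
)

# Basis-function expansion W(t) ~ sum_i c_i * phi_i(t) with a polynomial basis.
n_basis = 4
Phi = np.stack([t**i for i in range(n_basis)], axis=1)  # depth x n_basis

# Change of basis by least squares: no retraining, only the coefficients are kept.
coeffs, *_ = np.linalg.lstsq(Phi, layer_weights.reshape(depth, -1), rcond=None)
reconstructed = (Phi @ coeffs).reshape(depth, d, d)

print("compression ratio:", layer_weights.size / coeffs.size)
print("relative error:", np.linalg.norm(reconstructed - layer_weights)
      / np.linalg.norm(layer_weights))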
Spectral Analysis and Stability of Deep Neural Dynamics
TLDR
The view of neural networks as affine parameter-varying maps allows one to "crack open the black box" of global neural network dynamical behavior through visualization of stationary points, regions of attraction, state-space partitioning, eigenvalue spectra, and stability properties.
ANODEV2: A Coupled Neural ODE Framework
TLDR
Results are reported showing that the coupled ODE-based framework is indeed trainable, and that it achieves higher accuracy compared to the baseline ResNet network and the recently proposed Neural ODE approach.
IMEXnet: A Forward Stable Deep Neural Network
TLDR
The IMEXnet is introduced, which adapts semi-implicit methods for partial differential equations to address the field of view problem while still being comparable to standard convolutions in terms of the number of parameters and computational complexity.
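A hedged sketch of the semi-implicit idea: the stiff linear part (here a toy one-dimensional Laplacian, which spreads information widely in a single step) is solved implicitly, while the nonlinearity is stepped explicitly. The operator, step size, and names (imex_step, L, f) are assumptions for illustration, not the IMEXnet layer itself.

import numpy as np

def imex_step(x, L, f, h=0.5):
    # One implicit-explicit (IMEX) update:
    #   (I + h*L) x_{k+1} = x_k + h * f(x_k)
    # L is handled implicitly (stable for stiff operators), f explicitly.
    n = x.shape[0]
    rhs = x + h * f(x)
    return np.linalg.solve(np.eye(n) + h * L, rhs)

# Toy usage: a 1-D discrete Laplacian as the implicit part.
n = 16
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = lambda x: np.tanh(x)        # explicit nonlinear part
x = np.zeros(n)
x[n // 2] = 1.0                 # an impulse that gets spread out immediately
for _ in range(5):
    x = imex_step(x, L, f)
print(x.round(3))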
Dissipative Deep Neural Dynamical Systems
In this paper, we provide sufficient conditions for dissipativity and local asymptotic stability of discrete-time dynamical systems parametrized by deep neural networks. We leverage the representation…
Towards nominal stability certification of deep learning-based controllers
TLDR
Conditions are provided to verify the nominal stability of a system controlled by a deep-learning-based controller that uses a special form of neural network, called a non-autonomous deep network, as the controller.
...

References

SHOWING 1-10 OF 62 REFERENCES
Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations
TLDR
It is shown that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations, and a connection is established between stochastic control and noise injection in the training process, which helps to improve the generalization of the networks.
Highway and Residual Networks learn Unrolled Iterative Estimation
TLDR
It is demonstrated that an alternative viewpoint based on unrolled iterative estimation -- a group of successive layers iteratively refining their estimates of the same features instead of computing an entirely new representation -- leads to the construction of Highway and Residual networks.
Multi-level Residual Networks from Dynamical Systems View
TLDR
This paper adopts the dynamical systems point of view, and analyzes the lesioning properties of ResNet both theoretically and experimentally, and proposes a novel method for accelerating ResNet training.
Stable Architectures for Deep Neural Networks
TLDR
This paper relates the exploding and vanishing gradient phenomenon to the stability of the discrete ODE and presents several strategies for stabilizing deep learning for very deep networks.
The Reversible Residual Network: Backpropagation Without Storing Activations
TLDR
The Reversible Residual Network (RevNet) is presented, a variant of ResNets where each layer's activations can be reconstructed exactly from the next layer's; therefore, the activations for most layers need not be stored in memory during backpropagation.
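The additive coupling behind this reconstruction is simple enough to sketch. The toy residual branches F and G below are assumptions for the example, but the forward and inverse updates follow the reversible coupling described in the paper, so no intermediate activations need to be stored.

import numpy as np

def rev_forward(x1, x2, F, G):
    # Reversible residual coupling: the outputs determine the inputs exactly.
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_inverse(y1, y2, F, G):
    # Reconstruct the layer's inputs from its outputs.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

rng = np.random.default_rng(2)
W_f, W_g = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
F = lambda z: np.tanh(W_f @ z)
G = lambda z: np.tanh(W_g @ z)
x1, x2 = rng.standard_normal(4), rng.standard_normal(4)
y1, y2 = rev_forward(x1, x2, F, G)
r1, r2 = rev_inverse(y1, y2, F, G)
print(np.allclose(r1, x1), np.allclose(r2, x2))  # True True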
Maximum Principle Based Algorithms for Deep Learning
TLDR
The continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms using Pontryagin's maximum principle, demonstrating that it obtains a favorable initial convergence rate per iteration, provided Hamiltonian maximization can be carried out efficiently.
Input output stability of recurrent neural networks
TLDR
The author demonstrates how the derived criteria can be numerically evaluated with modern techniques for quadratic optimization; some of the techniques are then illustrated for the example of using a fully recurrent network to learn the dynamics of a chaotic system.
FractalNet: Ultra-Deep Neural Networks without Residuals
TLDR
In experiments, fractal networks match the excellent performance of standard residual networks on both CIFAR and ImageNet classification tasks, thereby demonstrating that residual representations may not be fundamental to the success of extremely deep convolutional neural networks.
On orthogonality and learning recurrent networks with long term dependencies
TLDR
This paper proposes a weight matrix factorization and parameterization strategy through which the degree of expansivity induced during backpropagation can be controlled, and finds that hard constraints on orthogonality can negatively affect the speed of convergence and model performance.
Adaptive Computation Time for Recurrent Neural Networks
TLDR
Performance is dramatically improved and insight is provided into the structure of the data, with more computation allocated to harder-to-predict transitions, such as spaces between words and ends of sentences, which suggests that ACT or other adaptive computation methods could provide a generic method for inferring segment boundaries in sequence data.
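Adaptive depth is also what NAIS-Net's stable unrolling enables, so a toy halting loop may help fix ideas: steps are applied until the accumulated halting probability exceeds 1 - eps, and the leftover probability mass is assigned to the final state. The step and halting functions below are made up for the example, and the loop is a simplification of ACT rather than the paper's differentiable formulation.

import numpy as np

def adaptive_steps(state, step_fn, halt_fn, eps=0.01, max_steps=20):
    # Apply step_fn repeatedly, accumulating halting probabilities from
    # halt_fn; stop once they exceed 1 - eps (or at max_steps) and return
    # the halting-weighted average of the visited states.
    states, weights, total = [], [], 0.0
    for n in range(max_steps):
        state = step_fn(state)
        p = halt_fn(state)
        if total + p >= 1.0 - eps or n == max_steps - 1:
            states.append(state)
            weights.append(1.0 - total)  # leftover probability mass
            break
        states.append(state)
        weights.append(p)
        total += p
    out = np.average(np.stack(states), axis=0, weights=np.array(weights))
    return out, len(states)

# Toy usage: a contraction as the step, a sigmoid of the state norm as the halting unit.
step_fn = lambda s: 0.9 * s + 0.1
halt_fn = lambda s: 1.0 / (1.0 + np.exp(2.0 - np.linalg.norm(s)))
out, n_used = adaptive_steps(np.zeros(3), step_fn, halt_fn)
print(out, n_used)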
...