• Corpus ID: 219687955

Monotone operator equilibrium networks

@article{Winston2020MonotoneOE,
  title={Monotone operator equilibrium networks},
  author={Ezra Winston and J. Zico Kolter},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.08591}
}
Implicit-depth models such as Deep Equilibrium Networks have recently been shown to match or exceed the performance of traditional deep networks while being much more memory efficient. However, these models suffer from unstable convergence to a solution and lack guarantees that a solution exists. On the other hand, Neural ODEs, another class of implicit-depth models, do guarantee existence of a unique solution but perform poorly compared with traditional networks. In this paper, we develop a… 
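The model described here computes its output as an equilibrium z* = σ(Wz* + Ux + b) rather than by stacking layers, and the monotone operator view parameterizes W so that this equilibrium provably exists and can be found by operator splitting. The sketch below is a minimal NumPy illustration of that idea for a single fully connected ReLU layer, assuming the W = (1 − m)I − AᵀA + B − Bᵀ parameterization and plain forward-backward splitting; it is an illustrative example, not the authors' implementation (the paper also develops faster splitting schemes and structured layers).

```python
import numpy as np

def monotone_W(A, B, m=0.5):
    """Parameterize W = (1 - m) I - A^T A + B - B^T so that I - W >= m I,
    i.e. the map z -> (I - W) z is m-strongly monotone and a unique
    equilibrium is guaranteed to exist."""
    d = A.shape[1]
    return (1.0 - m) * np.eye(d) - A.T @ A + B - B.T

def forward_backward_fixed_point(W, U, x, b, m=0.5, tol=1e-6, max_iter=20000):
    """Solve z* = relu(W z* + U x + b) by forward-backward splitting.
    relu is the prox of the indicator of the nonnegative orthant and
    F(z) = (I - W) z - (U x + b) is m-strongly monotone, so any step size
    alpha < 2 m / L^2 (L the Lipschitz constant of F) converges."""
    d = W.shape[0]
    I = np.eye(d)
    L = np.linalg.norm(I - W, 2)   # spectral norm = Lipschitz constant of F
    alpha = m / L ** 2             # conservative, provably convergent step
    z = np.zeros(d)
    for _ in range(max_iter):
        z_next = np.maximum(0.0, z - alpha * ((I - W) @ z - U @ x - b))
        if np.linalg.norm(z_next - z) < tol:
            break
        z = z_next
    return z

# Toy usage: an 8-dimensional hidden state driven by a 4-dimensional input.
rng = np.random.default_rng(0)
d, p = 8, 4
A, B = rng.normal(size=(d, d)) / d, rng.normal(size=(d, d)) / d
U, b, x = rng.normal(size=(d, p)), rng.normal(size=d), rng.normal(size=p)
z_star = forward_backward_fixed_point(monotone_W(A, B), U, x, b)
```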

Citations

JFB: Jacobian-Free Backpropagation for Implicit Networks
TLDR
Jacobian-Free Backpropagation (JFB), a fixed-memory approach that circumvents the need to solve Jacobian-based equations, is proposed; it makes implicit networks faster to train and significantly easier to implement, without sacrificing test accuracy.
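The idea described above can be sketched in a few lines: solve for the equilibrium without tracking gradients, then attach the computation graph of a single extra application of the layer, so the backward pass never solves a Jacobian-based linear system. The snippet below is an illustrative PyTorch sketch under that reading of the summary; the layer f, shapes, and iteration count are hypothetical, not taken from the JFB code.

```python
import torch

def jfb_equilibrium(f, x, z0, iters=50):
    """Find z* = f(z*, x) without building a graph, then apply f once more
    with gradients enabled; backprop then sees only that single application
    instead of a Jacobian-based implicit solve."""
    z = z0
    with torch.no_grad():
        for _ in range(iters):
            z = f(z, x)        # plain fixed-point iteration to the equilibrium
    return f(z, x)             # the one differentiable step used for training

# Hypothetical contraction-like layer, purely for illustration.
lin = torch.nn.Linear(16, 16)
f = lambda z, x: torch.tanh(0.5 * lin(z) + x)
x = torch.randn(4, 16, requires_grad=True)
z_star = jfb_equilibrium(f, x, torch.zeros(4, 16))
z_star.sum().backward()        # gradients flow through only the last f call
```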
Robustness Certificates for Implicit Neural Networks: A Mixed Monotone Contractive Approach
TLDR
A theoretical and computational framework for robustness verification of implicit neural networks is proposed, along with a novel relative classifier variable that leads to tighter bounds on certified adversarial robustness in classification problems.
Nonsmooth Implicit Differentiation for Machine Learning and Optimization
TLDR
A nonsmooth implicit function theorem with an operational calculus is established, and several applications are provided, such as training deep equilibrium networks, training neural nets with conic optimization layers, or hyperparameter tuning for nonsmooth Lasso-type models.
Optimization Induced Equilibrium Networks
TLDR
This work decomposes DNNs into a new class of unit layer that is the proximal operator of an implicit convex function while keeping its output unchanged, and derives an equilibrium model of the unit layer, named Optimization Induced Equilibrium Networks (OptEq), which can be easily extended to deep layers.
Robust Implicit Networks via Non-Euclidean Contractions
TLDR
A new framework to design well-posed and robust implicit neural networks based upon contraction theory for the non-Euclidean ℓ∞ norm, which leads to a larger polytopic training search space than existing conditions, and whose average iteration enjoys accelerated convergence.
Semialgebraic Representation of Monotone Deep Equilibrium Models and Applications to Certification
TLDR
A semialgebraic representation for ReLU-based monDEQs is introduced which allows the corresponding input-output relation to be approximated by semidefinite programming (SDP), and it is suggested that monDEQs are much more robust to ℓ2 perturbations than to ℓ∞ perturbations.
Lipschitz Bounded Equilibrium Networks
TLDR
New parameterizations of equilibrium neural networks defined by implicit equations are introduced and well-posedness (existence of solutions) is shown under less restrictive conditions on the network weights and more natural assumptions on the activation functions: that they are monotone and slope restricted.
Improved Model Based Deep Learning Using Monotone Operator Learning (Mol)
TLDR
This work introduces a novel monotone operator learning framework to overcome some of the challenges associated with current unrolled frameworks, including high memory cost, lack of guarantees on robustness to perturbations, and low interpretability.
Neural Deep Equilibrium Solvers
TLDR
These experiments show that these equilibrium solvers are fast to train (taking only a small fraction of the original DEQ's training time), add few additional parameters, and yield up to a 2× speedup in DEQ network inference without any degradation in accuracy across numerous domains and tasks.
On Training Implicit Models
TLDR
This work proposes a novel gradient estimate for implicit models, named the phantom gradient, that forgoes the costly computation of the exact gradient and provides an update direction that is empirically preferable for training implicit models.
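One way to realize such an estimate is to unroll a few damped fixed-point steps starting from the (detached) equilibrium and differentiate through only that short chain. The sketch below is illustrative; the step count k, damping lam, and the helper name are placeholder choices rather than the paper's exact estimator.

```python
import torch

def unrolled_approx_output(f, x, z_star, k=3, lam=0.5):
    """Illustrative approximate-gradient sketch: starting from a detached
    equilibrium z_star, take k damped steps z <- (1 - lam) z + lam f(z, x)
    with gradients enabled, and backpropagate through this short unrolled
    chain instead of solving the exact implicit (Jacobian-based) equation."""
    z = z_star.detach()
    for _ in range(k):
        z = (1.0 - lam) * z + lam * f(z, x)
    return z   # feed this into the loss; its backward pass gives the estimate
```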

References

Showing 1-10 of 28 references
Augmented Neural ODEs
TLDR
Augmented Neural ODEs are introduced which, in addition to being more expressive models, are empirically more stable, generalize better and have a lower computational cost than Neural ODEs.
Learning Multiple Layers of Features from Tiny Images
TLDR
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
Neural Ordinary Differential Equations
TLDR
This work shows how to scalably backpropagate through any ODE solver, without access to its internal operations, which allows end-to-end training of ODEs within larger models.
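Concretely, the backward pass is obtained by integrating an adjoint ODE backwards in time, so memory cost does not grow with the number of solver steps. A small usage sketch with the open-source torchdiffeq package (the network architecture, batch size, and time grid below are placeholder choices):

```python
import torch
from torchdiffeq import odeint_adjoint as odeint   # pip install torchdiffeq

class ODEFunc(torch.nn.Module):
    """The learned vector field dy/dt = net(y)."""
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, 64), torch.nn.Tanh(), torch.nn.Linear(64, dim))

    def forward(self, t, y):
        return self.net(y)

func = ODEFunc(dim=2)
y0 = torch.randn(16, 2)                    # batch of initial states
t = torch.linspace(0.0, 1.0, steps=2)      # integrate from t = 0 to t = 1
y1 = odeint(func, y0, t)[-1]               # forward solve; backward uses the adjoint ODE
loss = y1.pow(2).mean()
loss.backward()                            # constant-memory gradients w.r.t. func's parameters
```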
A Primer on Monotone Operator Methods
This tutorial paper presents the basic notation and results of monotone operators and operator splitting methods, with a focus on convex optimization.
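For reference, the basic notions this primer covers can be stated in standard notation (these are the usual textbook definitions, not quotations from the primer): an operator F is monotone if the first inequality below holds, m-strongly monotone if the stronger bound holds, and splitting methods such as forward-backward find a zero of a sum F + ∂g by alternating a gradient-style step on F with a proximal step on g.

```latex
\[
  \text{monotone: } \;\langle F(x) - F(y),\, x - y \rangle \ge 0
  \qquad
  \text{$m$-strongly monotone: } \;\langle F(x) - F(y),\, x - y \rangle \ge m\,\lVert x - y \rVert^2
\]
\[
  \text{forward-backward splitting for } 0 \in F(z) + \partial g(z): \qquad
  z^{k+1} = \operatorname{prox}_{\alpha g}\!\bigl(z^{k} - \alpha F(z^{k})\bigr)
\]
```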
Reading Digits in Natural Images with Unsupervised Feature Learning
TLDR
A new benchmark dataset for research use is introduced containing over 600,000 labeled digits cropped from Street View images, and variants of two recently proposed unsupervised feature learning methods are employed, finding that they are convincingly superior on benchmarks.
MNIST handwritten digit database
AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2010
Deep Equilibrium Models
TLDR
It is shown that DEQs often improve performance over these state-of-the-art models (for similar parameter counts), have similar computational requirements to existing models, and vastly reduce memory consumption (often the bottleneck for training large sequence models), demonstrating up to an 88% memory reduction in the authors' experiments.
Implicit Deep Learning
TLDR
The implicit framework greatly simplifies the notation of deep learning, and opens up many new possibilities, in terms of novel architectures and algorithms, robustness analysis and design, interpretability, sparsity, and network architecture optimization.
Contracting Implicit Recurrent Neural Networks: Stable Models with Improved Trainability
TLDR
An implicit model structure is proposed that allows for a convex parametrization of stable models using contraction analysis of non-linear systems, and a significant increase in training speed and model performance is observed.
Stable and expressive recurrent vision models
TLDR
It is demonstrated that recurrent vision models trained with C-RBP can detect long-range spatial dependencies in a synthetic contour tracing task that BPTT-trained models cannot, and that they outperform the leading feedforward approach.