# Monotone operator equilibrium networks

```bibtex
@article{Winston2020MonotoneOE,
  title   = {Monotone operator equilibrium networks},
  author  = {Ezra Winston and J. Zico Kolter},
  journal = {ArXiv},
  year    = {2020},
  volume  = {abs/2006.08591}
}
```

Implicit-depth models such as Deep Equilibrium Networks have recently been shown to match or exceed the performance of traditional deep networks while being much more memory efficient. However, these models suffer from unstable convergence to a solution and lack guarantees that a solution exists. On the other hand, Neural ODEs, another class of implicit-depth models, do guarantee existence of a unique solution but perform poorly compared with traditional networks. In this paper, we develop a…
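The core computation in an implicit-depth model is solving for an equilibrium of a single layer rather than stacking many layers. A minimal sketch (illustrative only, not the paper's exact monotone parameterization): iterate z = tanh(Wz + Ux + b) to a fixed point, with W scaled to have spectral norm below 1 so the map is a contraction and a unique equilibrium is guaranteed to exist.

```python
# Illustrative sketch of an implicit-depth layer solved by damped
# fixed-point iteration. Scaling W so that ||W||_2 < 1 makes the map a
# contraction (tanh is 1-Lipschitz), guaranteeing a unique equilibrium.
# This is a simplified stand-in for the paper's monotone parameterization.
import numpy as np

rng = np.random.default_rng(0)
d, p = 8, 4                       # hidden and input dimensions (arbitrary)
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)   # enforce ||W||_2 = 0.5 < 1
U = rng.standard_normal((d, p))
b = rng.standard_normal(d)

def equilibrium(x, tol=1e-8, max_iter=500):
    """Iterate z <- tanh(Wz + Ux + b) until the update is below tol."""
    z = np.zeros(d)
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + U @ x + b)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

x = rng.standard_normal(p)
z_star = equilibrium(x)
# z_star satisfies the fixed-point equation up to the tolerance:
residual = np.linalg.norm(z_star - np.tanh(W @ z_star + U @ x + b))
```

With the contraction factor fixed at 0.5, the iteration halves the error at each step, so convergence is fast; the paper's contribution is to guarantee this kind of well-posedness via monotone operator theory rather than a spectral-norm constraint.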

## 34 Citations

JFB: Jacobian-Free Backpropagation for Implicit Networks

- Computer Science
- 2021

Jacobian-Free Backpropagation (JFB), a fixed-memory approach that circumvents the need to solve Jacobian-based equations, is proposed that makes implicit networks faster to train and significantly easier to implement, without sacrificing test accuracy.
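The Jacobian-free idea is easiest to see on a scalar implicit layer. A hedged sketch (the names and setup are illustrative, not the paper's API): for z* = tanh(wz* + ux), the exact sensitivity dz*/dw requires inverting (1 − w·sech²), while the Jacobian-free estimate simply drops that inverse and backpropagates through one application of the layer at the equilibrium.

```python
# Hedged sketch of Jacobian-free backpropagation on a scalar implicit
# layer z* = tanh(w*z* + u*x). The implicit function theorem gives
#   dz*/dw = (s*z*) / (1 - w*s),  where s = sech^2 at the equilibrium;
# the Jacobian-free estimate omits the (1 - w*s)^{-1} correction.
import numpy as np

w, u, x = 0.4, 0.7, 1.3

# Solve for the equilibrium by fixed-point iteration (|w| < 1 => contraction).
z = 0.0
for _ in range(200):
    z = np.tanh(w * z + u * x)

s = 1.0 - np.tanh(w * z + u * x) ** 2   # sech^2 at the equilibrium
dz_dw_exact = (s * z) / (1.0 - w * s)   # implicit function theorem
dz_dw_jfb = s * z                       # Jacobian-free approximation
```

Here both quantities share the same sign and differ only by the dropped inverse factor; JFB trades exactness for a fixed-memory gradient estimate that avoids solving a Jacobian-based linear system.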

Robustness Certificates for Implicit Neural Networks: A Mixed Monotone Contractive Approach

- Computer Science, L4DC
- 2022

A theoretical and computational framework for robustness verification of implicit neural networks is proposed and a novel relative classifier variable is proposed that leads to tighter bounds on the certified adversarial robustness in classification problems.

Nonsmooth Implicit Differentiation for Machine Learning and Optimization

- Computer Science, Mathematics, NeurIPS
- 2021

A nonsmooth implicit function theorem with an operational calculus is established, and several applications are provided, such as training deep equilibrium networks, training neural nets with conic optimization layers, and hyperparameter tuning for nonsmooth Lasso-type models.

Optimization Induced Equilibrium Networks

- Computer Science, ArXiv
- 2021

This work decomposes DNNs into a new class of unit layer that is the proximal operator of an implicit convex function while keeping its output unchanged, and derives an equilibrium model of the unit layer, named Optimization Induced Equilibrium Networks (OptEq), which can be easily extended to deep layers.

Robust Implicit Networks via Non-Euclidean Contractions

- Computer Science, Mathematics, NeurIPS
- 2021

A new framework to design well-posed and robust implicit neural networks based on contraction theory for the non-Euclidean ℓ∞-norm, which leads to a larger polytopic training search space than existing conditions, and whose average iteration enjoys accelerated convergence.

Semialgebraic Representation of Monotone Deep Equilibrium Models and Applications to Certification

- Computer Science, NeurIPS
- 2021

A semialgebraic representation for ReLU-based monDEQs is introduced, which allows the corresponding input-output relation to be approximated by semidefinite programming (SDP), and the results suggest that monDEQs are much more robust to ℓ2 perturbations than to ℓ∞ perturbations.

Lipschitz Bounded Equilibrium Networks

- Computer Science, Mathematics, ArXiv
- 2020

New parameterizations of equilibrium neural networks defined by implicit equations are introduced and well-posedness (existence of solutions) is shown under less restrictive conditions on the network weights and more natural assumptions on the activation functions: that they are monotone and slope restricted.

Improved Model Based Deep Learning Using Monotone Operator Learning (Mol)

- Computer Science, 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI)
- 2022

This work introduces a novel monotone operator learning framework to overcome some of the challenges associated with current unrolled frameworks, including high memory cost, lack of guarantees on robustness to perturbations, and low interpretability.

Neural Deep Equilibrium Solvers

- Computer Science
- 2022

These equilibrium solvers are fast to train (requiring only a small fraction of the original DEQ’s training time), add few additional parameters, and yield up to a 2× speedup in DEQ network inference without any degradation in accuracy across numerous domains and tasks.

On Training Implicit Models

- Computer Science, NeurIPS
- 2021

This work proposes a novel gradient estimate for implicit models, named the phantom gradient, that forgoes the costly computation of the exact gradient and provides an update direction that is empirically preferable for implicit model training.

## References

Showing 1–10 of 28 references

Augmented Neural ODEs

- Computer Science, NeurIPS
- 2019

Augmented Neural ODEs are introduced which, in addition to being more expressive models, are empirically more stable, generalize better, and have a lower computational cost than Neural ODEs.

Learning Multiple Layers of Features from Tiny Images

- Computer Science
- 2009

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

Neural Ordinary Differential Equations

- Computer Science, NeurIPS
- 2018

This work shows how to scalably backpropagate through any ODE solver, without access to its internal operations, which allows end-to-end training of ODEs within larger models.

A Primer on Monotone Operator Methods

- Mathematics
- 2015

This tutorial paper presents the basic notation and results of monotone operators and operator splitting methods, with a focus on convex optimization. A very wide variety of algorithms, ranging from…
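The operator-splitting methods this primer surveys are what make equilibria of monotone layers computable. A brief sketch of one such method (forward-backward splitting, here applied to a lasso-style objective as an illustration, not an example taken from the primer): a gradient step handles the smooth part, and a proximal step handles the nonsmooth part.

```python
# Sketch of forward-backward splitting (proximal gradient) on the lasso
# objective 0.5*||Az - y||^2 + lam*||z||_1. The forward (gradient) step
# handles the smooth term; the backward (prox) step is soft-thresholding
# for the l1 term. Problem data are arbitrary, for illustration only.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
y = rng.standard_normal(20)
lam = 0.5
alpha = 1.0 / np.linalg.norm(A, 2) ** 2   # step size 1/L ensures convergence

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

z = np.zeros(5)
for _ in range(2000):
    grad = A.T @ (A @ z - y)                           # forward step
    z = soft_threshold(z - alpha * grad, alpha * lam)  # backward step

# At the solution, z is a fixed point of the forward-backward operator:
fb_residual = np.linalg.norm(
    z - soft_threshold(z - alpha * (A.T @ (A @ z - y)), alpha * lam)
)
```

The fixed-point view is the connection to the paper above: solving a monDEQ's forward pass amounts to applying exactly this kind of splitting iteration to a strongly monotone operator.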

Reading Digits in Natural Images with Unsupervised Feature Learning

- Computer Science
- 2011

A new benchmark dataset for research use is introduced containing over 600,000 labeled digits cropped from Street View images, and variants of two recently proposed unsupervised feature learning methods are employed and found to be convincingly superior on benchmarks.

MNIST handwritten digit database

- ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist,
- 2010

Deep Equilibrium Models

- Computer Science, NeurIPS
- 2019

It is shown that DEQs often improve performance over these state-of-the-art models (for similar parameter counts), have similar computational requirements to existing models, and vastly reduce memory consumption (often the bottleneck for training large sequence models), demonstrating an up to 88% memory reduction in the authors' experiments.

Implicit Deep Learning

- Computer Science, SIAM J. Math. Data Sci.
- 2021

The implicit framework greatly simplifies the notation of deep learning, and opens up many new possibilities, in terms of novel architectures and algorithms, robustness analysis and design, interpretability, sparsity, and network architecture optimization.

Contracting Implicit Recurrent Neural Networks: Stable Models with Improved Trainability

- Computer Science, L4DC
- 2020

An implicit model structure is proposed that allows for a convex parametrization of stable models using contraction analysis of non-linear systems and a significant increase in the speed of training and model performance is observed.

Stable and expressive recurrent vision models

- Computer Science, NeurIPS
- 2020

It is demonstrated that recurrent vision models trained with C-RBP can detect long-range spatial dependencies in a synthetic contour-tracing task that BPTT-trained models cannot, and outperform the leading feedforward approach.