Corpus ID: 1791473

OptNet: Differentiable Optimization as a Layer in Neural Networks

@inproceedings{Amos2017OptNetDO,
  title={OptNet: Differentiable Optimization as a Layer in Neural Networks},
  author={Brandon Amos and J. Zico Kolter},
  booktitle={ICML},
  year={2017}
}
This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks. These layers encode constraints and complex dependencies between the hidden states that traditional convolutional and fully-connected layers often cannot capture. In this paper, we explore the foundations for such an architecture: we show how techniques from sensitivity analysis, bilevel optimization, and implicit differentiation can be used to exactly differentiate through these layers.
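To make the construction concrete, here is a minimal sketch (not the paper's implementation; the function name and toy dimensions are illustrative assumptions) of the simplest differentiable special case: an equality-constrained QP, whose KKT conditions reduce to a single linear system, so an ordinary autograd linear solve already yields gradients with respect to the problem data.

import torch

def eq_constrained_qp_layer(Q, q, A, b):
    # Solve  min_z 0.5 z^T Q z + q^T z   subject to   A z = b
    # via its KKT system  [Q A^T; A 0][z; nu] = [-q; b].
    # torch.linalg.solve is differentiable, so gradients with respect
    # to Q, q, A, b are obtained by ordinary backpropagation.
    m = A.shape[0]
    K = torch.cat([
        torch.cat([Q, A.T], dim=1),
        torch.cat([A, torch.zeros(m, m, dtype=Q.dtype)], dim=1),
    ], dim=0)
    rhs = torch.cat([-q, b])
    sol = torch.linalg.solve(K, rhs)
    return sol[: Q.shape[0]]  # primal solution z*

# Toy usage: gradient of a scalar loss on z* with respect to q.
torch.manual_seed(0)
n, m = 4, 2
L = torch.randn(n, n)
Q = L @ L.T + 0.1 * torch.eye(n)    # positive definite
q = torch.randn(n, requires_grad=True)
A = torch.randn(m, n)
b = torch.randn(m)
z_star = eq_constrained_qp_layer(Q, q, A, b)
z_star.sum().backward()
print(q.grad)

With inequality constraints the solution map is only piecewise smooth, which is why the paper differentiates the KKT optimality conditions at the solution (via an interior-point solve) rather than unrolling solver iterations.

Citations
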
Differentiable Convex Optimization Layers
This paper introduces disciplined parametrized programming, a subset of disciplined convex programming, and demonstrates how to efficiently differentiate through each component of the resulting solution map, allowing for end-to-end analytical differentiation through the entire convex program.
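As a rough usage illustration (assuming the cvxpylayers package that accompanies this line of work; the nonnegative ℓ1-regression problem below is an invented example), a parametrized convex program can be wrapped as a differentiable PyTorch layer along these lines:

import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
# A disciplined-parametrized-programming-compliant problem.
problem = cp.Problem(cp.Minimize(cp.pnorm(A @ x - b, p=1)), [x >= 0])
layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])

A_t = torch.randn(m, n, requires_grad=True)
b_t = torch.randn(m, requires_grad=True)
x_star, = layer(A_t, b_t)   # forward pass solves the convex program
x_star.sum().backward()     # backward pass differentiates through the solution map
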
Differentiable Optimization of Generalized Nondecomposable Functions (2020)
We propose a framework which makes it feasible to directly train deep neural networks with respect to popular families of task-specific non-decomposable performance measures such as AUC, multi-class …
Convex optimization with an interpolation-based projection and its application to deep learning
This paper proposes an interpolation-based projection that is computationally cheap and easy to compute given a convex domain-defining function, together with an optimization algorithm that follows the gradient of the composition of the objective and the projection; convergence is proved for linear objectives and arbitrary convex, Lipschitz domain-defining inequality constraints.
Physarum Powered Differentiable Linear Programming Layers and Applications
This work proposes an efficient and differentiable solver for general linear programming problems that can be used in a plug-and-play manner as a layer within deep neural networks, wherever a learning procedure needs a fast approximate solution to an LP inside a larger network.
Differentiable Fixed-Point Iteration Layer
It is shown that the derivative of an FPI layer depends only on the fixed point itself, and a method is presented to compute this derivative efficiently using another fixed-point iteration, called the backward FPI.
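In standard implicit-function-theorem form (the notation here is ours, not the paper's), a fixed point $z^\star = f(z^\star, \theta)$ of a parametrized map $f$ satisfies

$$\frac{\partial z^\star}{\partial \theta} = \Big(I - \frac{\partial f}{\partial z}\Big|_{z^\star}\Big)^{-1} \frac{\partial f}{\partial \theta}\Big|_{z^\star},$$

so the derivative indeed depends only on $f$ evaluated at the fixed point, and the linear system involving $(I - \partial f/\partial z)$ can itself be solved by a second, linear fixed-point iteration, which is presumably what the backward FPI amounts to.
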
CNNs through Differentiable PDE Layer
Recent studies at the intersection of physics and deep learning have illustrated successes in the application of deep neural networks to partially or fully replace costly physics simulations. …
Differentiable Learning of Submodular Models
Can we incorporate discrete optimization algorithms within modern machine learning models? For example, is it possible to use in deep architectures a layer whose output is the minimal cut of a …
Linear Inequality Constraints for Neural Network Activations
This work proposes a method to impose homogeneous linear inequality constraints of the form $Ax \leq 0$ on neural network activations and experimentally demonstrates the proposed method by constraining a variational autoencoder.
Homogeneous Linear Inequality Constraints for Neural Network Activations
We propose a method to impose homogeneous linear inequality constraints of the form Ax ≤ 0 on neural network activations. The proposed method allows a data-driven training approach to be combined …
Learning for Integer-Constrained Optimization through Neural Networks with Limited Training
A symmetric and decomposed neural network structure is introduced, which is fully interpretable in terms of the functionality of its constituent components and offers superior generalization performance with limited training, as compared to other generic neural network structures that do not exploit the inherent structure of the integer constraint.

References

Showing 1-10 of 42 references
Input Convex Neural Networks
This paper presents the input convex neural network architecture. These are scalar-valued (potentially deep) neural networks with constraints on the network parameters such that the output of the network is a convex function of (some of) the inputs.
Adam: A Method for Stochastic Optimization
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
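For reference, the adaptive moment estimates mentioned above are the standard Adam recursions, with gradient $g_t$, step size $\alpha$, decay rates $\beta_1, \beta_2$, and a small constant $\epsilon$:

$$m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2,$$
$$\hat m_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat v_t = \frac{v_t}{1-\beta_2^t}, \qquad \theta_t = \theta_{t-1} - \alpha\, \frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon}.$$
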
Conditional Random Fields as Recurrent Neural Networks
A new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling is introduced, and top results are obtained on the challenging Pascal VOC 2012 segmentation benchmark.
Learning Deep Structured Models
This paper proposes a training algorithm that is able to learn structured models jointly with deep features that form the MRF potentials and demonstrates the effectiveness of this algorithm in the tasks of predicting words from noisy images, as well as tagging of Flickr photographs.
On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization
Some results on differentiating argmin and argmax optimization problems with and without constraints are collected and some insightful motivating examples are provided.
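The central identity in this line of work, in the unconstrained, twice-differentiable case, follows from differentiating the stationarity condition $\nabla_x f(x^\star(\theta), \theta) = 0$: for $x^\star(\theta) = \arg\min_x f(x, \theta)$,

$$\frac{\partial x^\star}{\partial \theta} = -\big(\nabla_{xx}^2 f(x^\star, \theta)\big)^{-1} \nabla_{x\theta}^2 f(x^\star, \theta).$$

Constrained variants, as used in OptNet, differentiate the full KKT system instead.
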
On solving constrained optimization problems with neural networks: a penalty method approach
The canonical nonlinear programming circuit is shown to be a gradient system that seeks to minimize an unconstrained energy function that can be viewed as a penalty method approximation of the original problem.
Generative Adversarial Nets
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Generic Methods for Optimization-Based Modeling
Experimental results on denoising and image labeling problems show that learning with truncated optimization greatly reduces computational expense compared to “full” fitting.
End-to-End Learning for Structured Prediction Energy Networks
End-to-end learning for SPENs is presented, where the energy function is discriminatively trained by back-propagating through gradient-based prediction, and the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016).
A Bilevel Optimization Approach for Parameter Learning in Variational Models
This work considers a class of image denoising models incorporating $\ell_p$-norm-based analysis priors using a fixed set of linear operators and devises semismooth Newton methods for solving the resulting nonsmooth bilevel optimization problems.