Corpus ID: 244527636

Input Convex Gradient Networks

@article{RichterPowell2021InputCG,
  title={Input Convex Gradient Networks},
  author={Jack Richter-Powell and Jonathan Lorraine and Brandon Amos},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.12187}
}
The gradients of convex functions are expressive models of non-trivial vector fields. For example, Brenier’s theorem yields that the optimal transport map between any two measures on Euclidean space under the squared distance is realized as a convex gradient, which is a key insight used in recent generative flow models. In this paper, we study how to model convex gradients by integrating a Jacobian-vector product parameterized by a neural network, which we call the Input Convex Gradient Network… 
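Read literally, the construction is easy to prototype: parameterize a symmetric positive semi-definite Jacobian field with a small network and recover a candidate convex gradient by integrating Jacobian-vector products along the segment from the origin to the input. The following JAX sketch illustrates that idea; the helper names (`psd_jacobian`, `convex_gradient`), the MLP, the A(x)ᵀA(x) parameterization, and the midpoint quadrature are illustrative assumptions, not the paper's exact architecture.

```python
import jax
import jax.numpy as jnp

def init_params(key, dim, hidden=64):
    # Small MLP mapping x to a (dim x dim) matrix A(x); H(x) = A(x)^T A(x) is
    # then symmetric positive semi-definite by construction.
    k1, k2 = jax.random.split(key)
    return {"W1": jax.random.normal(k1, (hidden, dim)) / jnp.sqrt(dim),
            "b1": jnp.zeros(hidden),
            "W2": jax.random.normal(k2, (dim * dim, hidden)) / jnp.sqrt(hidden),
            "b2": jnp.zeros(dim * dim)}

def psd_jacobian(params, x):
    # Hypothetical PSD parameterization of the Jacobian of the gradient map.
    h = jnp.tanh(params["W1"] @ x + params["b1"])
    A = (params["W2"] @ h + params["b2"]).reshape(x.shape[0], x.shape[0])
    return A.T @ A  # symmetric PSD

def convex_gradient(params, x, n_quad=16):
    # Approximate g(x) = ∫_0^1 H(t x) x dt with a midpoint rule; when H is the
    # (symmetric PSD) Jacobian of g, g is the gradient of a convex function.
    ts = (jnp.arange(n_quad) + 0.5) / n_quad
    jvps = jax.vmap(lambda t: psd_jacobian(params, t * x) @ x)(ts)
    return jvps.mean(axis=0)

params = init_params(jax.random.PRNGKey(0), dim=2)
print(convex_gradient(params, jnp.array([0.5, -1.0])))
```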

Efficient Gradient Flows in Sliced-Wasserstein Space

It is argued that this method is more flexible than JKO-ICNN, since SW enjoys a closed-form differentiable approximation and can be parameterized by any generative model, which alleviates the computational burden and makes it tractable in higher dimensions.
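For context on why SW admits a closed-form differentiable approximation: the sliced-Wasserstein distance averages one-dimensional Wasserstein distances over random projection directions, and each 1D distance reduces to comparing sorted projections. A minimal sketch, assuming equal-size empirical measures with uniform weights (the function name `sliced_w2` and the Monte Carlo estimator are illustrative):

```python
import jax
import jax.numpy as jnp

def sliced_w2(key, x, y, n_proj=64):
    # Monte Carlo estimate of the squared sliced-Wasserstein-2 distance between
    # two empirical measures x, y of shape (n, d): project onto random unit
    # directions, then compare sorted 1D projections (the 1D optimal coupling).
    d = x.shape[1]
    thetas = jax.random.normal(key, (n_proj, d))
    thetas = thetas / jnp.linalg.norm(thetas, axis=1, keepdims=True)
    px = jnp.sort(x @ thetas.T, axis=0)  # (n, n_proj)
    py = jnp.sort(y @ thetas.T, axis=0)
    return jnp.mean((px - py) ** 2)
```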

Supervised Training of Conditional Monge Maps

CondOT is introduced, an approach to estimate OT maps conditioned on a context variable, using several pairs of measures tagged with a context label c_i to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells, using only observations of the effects of said perturbations applied separately.

Neural Unbalanced Optimal Transport via Cycle-Consistent Semi-Couplings

This work introduces NubOT, a neural unbalanced OT formulation that relies on the formalism of semi-couplings to account for the creation and destruction of mass; it derives an efficient parameterization based on neural optimal transport maps and proposes a novel algorithmic scheme built on a cycle-consistent training procedure.

Learning Gradients of Convex Functions with Monotone Gradient Networks

This work proposes C-MGN and M-MGN, two monotone gradient neural network architectures for directly learning the gradients of convex functions, and shows that these networks are simpler to train, learn monotone gradient fields more accurately, and use significantly fewer parameters than state-of-the-art methods.

References

Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks

An approach is proposed that relies on the recently introduced input-convex neural networks (ICNN) to parametrize the space of convex functions in order to approximate the JKO scheme, as well as to design functionals over measures that enjoy convergence guarantees.
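For reference, the JKO scheme being approximated takes Wasserstein proximal steps of the standard form below (notation is generic, not taken from this reference; τ is the step size and 𝓕 the objective functional):

```latex
\rho_{k+1} \in \operatorname*{arg\,min}_{\rho} \; \mathcal{F}(\rho) + \frac{1}{2\tau} W_2^2\!\left(\rho, \rho_k\right)
```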

Optimal transport mapping via input convex neural networks

This approach ensures that the transport map the authors find is optimal regardless of how the neural networks are initialized, as the gradient of a convex function naturally models a discontinuous transport map.

Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization

This paper introduces Convex Potential Flows (CP-Flow), a natural and efficient parameterization of invertible models inspired by optimal transport (OT) theory, and proves that CP-Flows are universal density approximators and are optimal in the OT sense.

Input Convex Neural Networks

This paper presents the input convex neural network architecture. These are scalar-valued (potentially deep) neural networks with constraints on the network parameters such that the output of the network is a convex function of (some of) the inputs.
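The constraint in question is that the weights acting on the previous hidden state are nonnegative and the activations are convex and nondecreasing, which makes the scalar output convex in the input. A minimal JAX sketch of a fully input-convex forward pass (the softplus reparameterization for nonnegativity, the initialization, and the layer sizes are illustrative choices, not the paper's exact formulation):

```python
import jax
import jax.numpy as jnp

def icnn_forward(params, x):
    # Each layer computes relu(Wz z + Wx x + b) with Wz elementwise nonnegative,
    # so the composition is convex and nondecreasing in z and hence convex in x.
    z = jax.nn.relu(params[0]["Wx"] @ x + params[0]["b"])
    for layer in params[1:]:
        Wz = jax.nn.softplus(layer["Wz"])  # one way to keep weights nonnegative
        z = jax.nn.relu(Wz @ z + layer["Wx"] @ x + layer["b"])
    return jnp.sum(z)  # scalar convex potential F(x)

def init_icnn(key, dim, hidden=32, depth=2):
    keys = jax.random.split(key, 2 * depth + 1)
    layers = [{"Wx": 0.1 * jax.random.normal(keys[0], (hidden, dim)),
               "b": jnp.zeros(hidden)}]
    for i in range(1, depth + 1):
        layers.append({"Wz": 0.1 * jax.random.normal(keys[2 * i - 1], (hidden, hidden)),
                       "Wx": 0.1 * jax.random.normal(keys[2 * i], (hidden, dim)),
                       "b": jnp.zeros(hidden)})
    return layers

# Differentiating the scalar output yields a convex gradient, cf. the paper above.
grad_F = jax.grad(icnn_forward, argnums=1)
```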

JacNet: Learning Functions with Structured Jacobians

This work proposes to directly learn the Jacobian of the input-output function with a neural network, which allows easy control of the derivative; it focuses on structuring the derivative to allow invertibility and demonstrates that other useful priors can also be enforced.

Large-Scale Wasserstein Gradient Flows

This work introduces a scalable method, targeted at machine learning applications, to approximate Wasserstein gradient flows; it relies on input-convex neural networks to discretize the JKO steps, which can then be optimized by stochastic gradient descent.

Sorting out Lipschitz function approximation

This work identifies a necessary property for such an architecture: each layer must preserve the gradient norm during backpropagation. It proposes combining a gradient-norm-preserving activation function, GroupSort, with norm-constrained weight matrices, yielding networks that are universal Lipschitz function approximators.
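The GroupSort activation itself is just a coordinate permutation: activations are split into groups and sorted within each group, so the Jacobian is a permutation matrix and the gradient norm passes through unchanged. A minimal sketch (the name `group_sort` and the default group size of 2, i.e. MaxMin, are illustrative choices):

```python
import jax.numpy as jnp

def group_sort(z, group_size=2):
    # Split the activation vector into groups and sort within each group.
    # Assumes len(z) is divisible by group_size. Sorting only permutes
    # coordinates, so the gradient norm is preserved during backpropagation.
    groups = z.reshape(-1, group_size)
    return jnp.sort(groups, axis=1).reshape(z.shape)
```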

An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality

Novel architectures that are guaranteed to satisfy the triangle inequality are introduced, and it is shown that these architectures outperform existing metric approaches when modeling graph distances and have a better inductive bias than non-metric approaches in the multi-goal reinforcement learning setting when training data is limited.

Learning Generative Models with Sinkhorn Divergences

This paper presents the first tractable computational method to train large-scale generative models using an optimal transport loss, and tackles three issues by relying on two key ideas: entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed-point iterations, and algorithmic (automatic) differentiation of these iterations.
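The Sinkhorn fixed-point iterations referred to here are only a few lines and, as noted, can be differentiated through automatically. A minimal sketch for two uniform discrete measures (the uniform marginals, `eps`, and the iteration count are illustrative choices):

```python
import jax.numpy as jnp

def sinkhorn(cost, eps=0.1, n_iters=100):
    # Entropically smoothed OT between two uniform discrete measures given a
    # cost matrix: alternate the Sinkhorn fixed-point updates on scalings u, v.
    n, m = cost.shape
    a, b = jnp.ones(n) / n, jnp.ones(m) / m
    K = jnp.exp(-cost / eps)
    u, v = jnp.ones(n), jnp.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]  # approximate optimal coupling
    return jnp.sum(P * cost)         # smoothed OT cost
```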

Wasserstein-2 Generative Networks

This paper proposes a novel end-to-end algorithm for training generative models that uses a non-minimax objective, simplifying model training, and approximates the Wasserstein-2 distance with Input Convex Neural Networks.