Corpus ID: 250408213

On the representation and learning of monotone triangular transport maps

Ricardo Baptista, Youssef M. Marzouk, Olivier Zahm

Abstract: Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps, approximations of the Knothe–Rosenblatt (KR) rearrangement, are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization…

On minimax density estimation via measure transport

We study the convergence properties, in Hellinger and related distances, of nonparametric density estimators based on measure transport. These estimators represent the measure of interest as the

Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective

This work uses an information-theoretic analysis to derive a bound on the posterior error due to parameter and data dimension reduction and compares it with classical dimension reduction techniques, such as principal component analysis and canonical correlation analysis, on applications ranging from mechanics to image processing.



Learning non-Gaussian graphical models via Hessian scores and triangular transport

An algorithm for learning the Markov structure of continuous and non-Gaussian distributions is proposed and it is shown that the algorithm recovers the graph structure even with a biased approximation to the density.

Greedy inference with structure-exploiting lazy maps

This paper proves weak convergence of the generated sequence of distributions to the posterior, and demonstrates the benefits of the framework on challenging inference problems in machine learning and differential equations, using inverse autoregressive flows and polynomial maps as examples of the underlying density estimators.

Inference via Low-Dimensional Couplings

This paper establishes a link between the Markov properties of the target measure and the existence of low-dimensional couplings, induced by transport maps that are sparse and/or decomposable, and suggests new inference methodologies for continuous non-Gaussian graphical models.

Scalable Bayesian transport maps for high-dimensional non-Gaussian spatial fields

This work proposes Bayesian nonparametric inference on the transport map by modeling its components using Gaussian processes, which enables regularization and uncertainty quantification of the map estimation, while still resulting in a closed-form and invertible posterior map.

Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization

This paper introduces Convex Potential Flows (CP-Flow), a natural and efficient parameterization of invertible models inspired by the optimal transport (OT) theory, and proves that CP-Flows are universal density approximators and are optimal in the OT sense.

A general spline representation for nonparametric and semiparametric density estimates using diffeomorphisms

A theorem of McCann shows that for any two absolutely continuous probability measures on R^d there exists a monotone transformation sending one probability measure to the other. A consequence of this

Bayesian inference with optimal maps

Beyond normality: Learning sparse probabilistic graphical models in the non-Gaussian setting

An algorithm is presented to identify sparse dependence structure in continuous and non-Gaussian probability distributions, given a corresponding set of data. The algorithm relies on exploiting the connection between the sparsity of the graph and the sparsity of transport maps, which deterministically couple one probability measure to another.

Sum-of-Squares Polynomial Flow

This work proposes a general framework for high-dimensional density estimation, by specifying one-dimensional transformations (equivalently conditional densities) and appropriate conditioner networks and motivates a new Sum-of-Squares (SOS) flow that is interpretable, universal, and easy to train.

Nonlinear dimension reduction for surrogate modeling using gradient information

It is shown that building a nonlinear feature map g can permit more accurate approximation of u than a linear g, for the same input data set.