• Corpus ID: 231692906

Projected Statistical Methods for Distributional Data on the Real Line with the Wasserstein Metric

Matteo Pegoraro and Mario Beraha
J. Mach. Learn. Res.
We present a novel class of projected methods to perform statistical analysis on a data set of probability distributions on the real line with the 2-Wasserstein metric. We focus in particular on Principal Component Analysis (PCA) and regression. To define these models, we exploit a representation of the Wasserstein space closely related to its weak Riemannian structure, by mapping the data to a suitable linear space and using a metric projection operator to constrain the results in the… 
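
As a rough illustration of the two ingredients named in the abstract (a linear-space representation and a metric projection), the sketch below relies on a standard fact: on the real line, the 2-Wasserstein distance between two distributions equals the L2 distance between their quantile functions. This is a minimal sketch and not the authors' implementation. The function names, the uniform grid of quantile levels, and the use of the pool-adjacent-violators algorithm to project onto nondecreasing vectors are all assumptions made here for illustration.

```python
import numpy as np


def quantile_on_grid(sample, grid):
    """Empirical quantile function of `sample` at the levels in `grid` (values in (0, 1))."""
    return np.quantile(np.asarray(sample, dtype=float), grid)


def w2_distance(q1, q2):
    """Approximate 2-Wasserstein distance between two distributions on the real line.

    Assumes q1 and q2 are quantile functions evaluated on the same uniform grid
    of levels, so the L2 integral over (0, 1) is approximated by a plain mean.
    """
    q1 = np.asarray(q1, dtype=float)
    q2 = np.asarray(q2, dtype=float)
    return float(np.sqrt(np.mean((q1 - q2) ** 2)))


def project_monotone(y):
    """Discrete L2 projection of `y` onto nondecreasing vectors.

    Uses the pool-adjacent-violators algorithm with equal weights: adjacent
    blocks whose means are out of order are merged and replaced by their mean.
    """
    blocks = []  # each block is [sum of values, number of values]
    for v in np.asarray(y, dtype=float):
        blocks.append([float(v), 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])
```

For instance, a linear combination of discretized quantile functions may fail to be nondecreasing; `project_monotone` returns the closest nondecreasing vector in the discrete L2 sense, which can again be read as a discretized quantile function.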

Fast PCA in 1-D Wasserstein Spaces via B-splines Representation and Metric Projection

A novel definition of Principal Component Analysis in the Wasserstein space is proposed. It yields a straightforward optimization problem that is extremely fast to solve and performs similarly to the definitions already proposed in the literature, at a much smaller computational cost.

Efficient Convex PCA with applications to Wasserstein geodesic PCA and ranked data

A numerical implementation of convex PCA is given for the case where the convex set is polyhedral, and it is shown that this provides a natural approximation of Wasserstein geodesic PCA.

Spherical Autoregressive Models, With Application to Distributional and Compositional Time Series

We introduce a new class of autoregressive models for spherical time series, where the dimension of the spheres on which the observations of the time series are situated may be finite-dimensional or

Normalized Latent Measure Factor Models

This work considers a prior distribution for a collection of discrete random measures where each measure is a linear combination of a set of latent measures, interpretable as characteristic traits shared by different distributions, with positive random weights.

Autoregressive Optimal Transport Models

Series of distributions indexed by equally spaced time points are ubiquitous in applications and their analysis constitutes one of the challenges of the emerging field of distributional data



Geodesic PCA in the Wasserstein space by Convex PCA

We introduce the method of Geodesic Principal Component Analysis (GPCA) on the space of probability measures on the line, with finite second moment, endowed with the Wasserstein metric. We discuss

Wasserstein Regression

The analysis of samples of random objects that do not lie in a vector space has found increasing attention in statistics in recent years. An important class of such object data is univariate


A framework is devised to quantify the dependence of a random vector of probabilities in terms of closeness to exchangeability, which corresponds to the maximally dependent coupling with the same marginal distributions, i.e. the comonotonic vector.

On parameter estimation with the Wasserstein distance

These results cover the misspecified setting, in which the data-generating process is not assumed to be part of the family of distributions described by the model, and some difficulties arising in the numerical approximation of these estimators are discussed.

Approximate Bayesian computation with the Wasserstein distance

This work proposes to avoid the use of summaries and the ensuing loss of information by instead using the Wasserstein distance between the empirical distributions of the observed and synthetic data, and generalizes the well‐known approach of using order statistics within approximate Bayesian computation to arbitrary dimensions.
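
The link to order statistics mentioned above can be made concrete in one dimension: for two samples of equal size on the real line, the p-Wasserstein distance between their empirical distributions is obtained by pairing the sorted values. The sketch below illustrates only this one-dimensional fact; the function name `wasserstein_1d` and the equal-sample-size restriction are assumptions made here, and it is not the procedure used in the cited work.

```python
import numpy as np


def wasserstein_1d(x, y, p=1):
    """p-Wasserstein distance between the empirical distributions of x and y.

    Assumes one-dimensional samples of equal size, in which case the optimal
    coupling matches the i-th order statistic of x with that of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))
```

In an approximate Bayesian computation loop, a distance of this kind would be compared against a tolerance to decide whether a synthetic data set is close enough to the observed one.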

Principal geodesic analysis for the study of nonlinear statistics of shape

The method of principal geodesic analysis, a generalization of principal component analysis to the manifold setting, is developed, and its use is demonstrated in describing the variability of medially-defined anatomical objects.

Geodesic Regression and the Theory of Least Squares on Riemannian Manifolds

  • P. Fletcher • Mathematics • International Journal of Computer Vision • 2012
Specific examples are given for a set of synthetically generated rotation data and for an application to analyzing shape changes in the corpus callosum due to age; the method can be applied generally to data on any manifold.

Wasserstein autoregressive models for density time series

The proposed autoregressive model outperforms existing methods in two of the data sets, while the best empirical performance in the other two data sets is attained by existing methods based on functional transformations of the densities.

Simplicial principal component analysis for density functions in Bayes spaces

Intrinsic Statistics on Riemannian Manifolds: Basic Tools for Geometric Measurements

  • X. Pennec • Mathematics • Journal of Mathematical Imaging and Vision • 2006
This paper provides a new proof of the characterization of Riemannian centers of mass and an original gradient descent algorithm to compute them efficiently. It also develops the notions of mean value and covariance matrix of a random element, normal law, Mahalanobis distance, and χ² law.