Corpus ID: 228064556

A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance

Minhui Huang, Shiqian Ma, Lifeng Lai
The Wasserstein distance has become increasingly important in machine learning and deep learning. Despite its popularity, the Wasserstein distance is hard to approximate because of the curse of dimensionality. A recently proposed approach to alleviate the curse of dimensionality is to project the sampled data from the high dimensional probability distribution onto a lower-dimensional subspace, and then compute the Wasserstein distance between the projected data. However, this approach requires… 
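The project-then-transport idea in the abstract can be sketched directly: project both sample sets onto a fixed orthonormal frame and solve the resulting low-dimensional optimal transport problem exactly. This is a minimal sketch assuming equal-size samples with uniform weights; the random frame `U` and the assignment-based solver are illustrative choices, not the paper's RBCD method.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def projected_wasserstein(X, Y, U):
    """2-Wasserstein distance between equal-size empirical measures after
    projecting the samples onto the subspace spanned by the columns of U."""
    PX, PY = X @ U, Y @ U                      # project to k dimensions
    # Squared-Euclidean cost matrix between projected samples.
    C = ((PX[:, None, :] - PY[None, :, :]) ** 2).sum(-1)
    r, c = linear_sum_assignment(C)            # exact OT for uniform weights
    return np.sqrt(C[r, c].mean())

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 30))                  # samples in dimension d = 30
Y = rng.normal(loc=1.0, size=(64, 30))         # shifted distribution
U, _ = np.linalg.qr(rng.normal(size=(30, 2)))  # random orthonormal 2-D frame
d = projected_wasserstein(X, Y, U)
```

For a fixed frame this is an ordinary OT problem; the methods surveyed on this page differ in how they *optimize* over the frame `U` (e.g. on the Stiefel manifold) rather than fixing it at random.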

Projection Robust Wasserstein Barycenter

This paper proposes the projection robust Wasserstein barycenter (PRWB), which mitigates the curse of dimensionality, and incorporates the PRWB into a discrete distribution clustering algorithm; the numerical results confirm that the model significantly improves clustering performance.

Projection Robust Wasserstein Barycenters

This paper proposes the projection robust Wasserstein barycenter (PRWB), which has the potential to mitigate the curse of dimensionality, and a relaxed PRWB (RPRWB) model that is computationally more tractable.

Two-sample Test using Projected Wasserstein Distance

A projected Wasserstein distance is developed to circumvent the curse of dimensionality in the two-sample test, a fundamental problem in statistics and machine learning: given two sets of samples, determine whether they are drawn from the same distribution.

Two-Sample Test with Kernel Projected Wasserstein Distance

We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, determine whether they are drawn from the same distribution.

A Riemannian exponential augmented Lagrangian method for computing the projection robust Wasserstein distance

A Riemannian exponential augmented Lagrangian method (ReALM) with a global convergence guarantee is proposed to compute the PRW distance, formulated as an optimization problem over the Cartesian product of the Stiefel manifold and Euclidean space with additional nonlinear inequality constraints.

Statistical, Robustness, and Computational Guarantees for Sliced Wasserstein Distances

This work quantifies the scalability of sliced Wasserstein distances from three key aspects: empirical convergence rates, robustness to data contamination, and efficient computational methods; it characterizes minimax optimal, dimension-free robust estimation risks and shows an equivalence between robust 1-Wasserstein estimation and robust mean estimation.

Riemannian Hamiltonian methods for min-max optimization on manifolds

The proposed Riemannian Hamiltonian methods (RHM) are extended to include consensus regularization and to the stochastic setting, and their efficacy is illustrated in applications such as subspace robust Wasserstein distance, robust training of neural networks, and generative adversarial networks.

Gradient Descent Ascent for Min-Max Problems on Riemannian Manifold

This is the first study of minimax optimization over Riemannian manifolds, and it is proved that the MVR-RSGDA algorithm achieves a lower sample complexity of $\tilde{O}(\kappa^{4}\epsilon^{-3})$ without requiring large batches, nearly matching the best known sample complexity of its Euclidean counterparts.

Detecting Incipient Fault Using Wasserstein Distance

A novel process monitoring method based on the Wasserstein distance is proposed for incipient fault detection; the Riemannian Block Coordinate Descent (RBCD) algorithm, which is fast when the number of sampled data points is large, is used to solve this model.



Projection Robust Wasserstein Distance and Riemannian Optimization

A first step toward a computational theory of the PRW distance is provided, establishing links between optimal transport and Riemannian optimization.

Subspace Robust Wasserstein distances

This work proposes a "max-min" robust variant of the Wasserstein distance by considering the maximal possible distance that can be realized between two measures, assuming they can be projected orthogonally onto a lower $k$-dimensional subspace.
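For $k = 1$, the "max-min" construction reduces to the max-sliced Wasserstein distance, which can be approximated by brute force over random unit directions. This is a hedged stand-in for the Riemannian optimization the literature actually uses, assuming equal-size empirical measures:

```python
import numpy as np

def w2_1d(x, y):
    """Exact 1-D 2-Wasserstein distance between equal-size empirical
    measures: RMS gap between the sorted samples."""
    return np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2))

def max_sliced_w2(X, Y, n_dirs=500, seed=0):
    """k = 1 'max-min' variant: maximize the projected 1-D distance over
    random unit directions (brute force instead of manifold optimization)."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_dirs):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)
        best = max(best, w2_1d(X @ theta, Y @ theta))
    return best

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
Y = rng.normal(size=(100, 10)); Y[:, 0] += 3.0  # shift along one coordinate
d = max_sliced_w2(X, Y)
```

The random search recovers a direction roughly aligned with the shifted coordinate; the papers above replace this search with optimization over the Stiefel manifold, which also handles $k > 1$.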

Riemannian adaptive stochastic gradient algorithms on matrix manifolds

This work proposes novel stochastic gradient algorithms for problems on Riemannian matrix manifolds that adapt the row and column subspaces of gradients, achieving a convergence rate of order $\mathcal{O}(\log (T)/\sqrt{T})$, where $T$ is the number of iterations.

Vector Transport-Free SVRG with General Retraction for Riemannian Optimization: Complexity Analysis and Practical Implementation

A vector-transport-free stochastic variance reduced gradient method with general retraction for empirical risk minimization over Riemannian manifolds is proposed and named S-SVRG, where the first "S" stands for "simple".

On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification

The viewpoint of projection robust (PR) OT is adopted, which seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they are projected; asymptotic guarantees for two types of minimum PRW estimators and a central limit theorem for the max-sliced Wasserstein estimator under model misspecification are established.

Proximal Gradient Method for Nonsmooth Optimization over the Stiefel Manifold

It is proved that the proposed retraction-based proximal gradient method globally converges to a stationary point, and its iteration complexity for obtaining an $\epsilon$-stationary solution is analyzed.
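The retraction machinery underlying such Stiefel-manifold methods is easy to illustrate: project the Euclidean gradient onto the tangent space and retract the step via QR. This sketch covers only a smooth Riemannian gradient step (no proximal term), and the objective $\mathrm{trace}(X^\top A X)$ is a toy choice, not the paper's:

```python
import numpy as np

def qr_retraction(X, xi):
    """Retract the tangent step xi back onto the Stiefel manifold via QR."""
    Q, R = np.linalg.qr(X + xi)
    s = np.sign(np.diag(R))            # fix column signs for continuity
    s[s == 0] = 1.0
    return Q * s

def stiefel_grad_step(X, egrad, step):
    """One Riemannian gradient step on St(n, p): project the Euclidean
    gradient onto the tangent space at X, then retract."""
    sym = (X.T @ egrad + egrad.T @ X) / 2
    rgrad = egrad - X @ sym            # tangent-space projection
    return qr_retraction(X, -step * rgrad)

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8)); A = A + A.T        # symmetric matrix
X, _ = np.linalg.qr(rng.normal(size=(8, 2)))    # random point on St(8, 2)
for _ in range(200):                            # maximize trace(X^T A X)
    X = stiefel_grad_step(X, -2 * A @ X, step=0.02)
err = np.linalg.norm(X.T @ X - np.eye(2))       # orthonormality preserved
```

Every iterate stays exactly on the manifold because the retraction re-orthonormalizes the step, which is the property the proximal gradient method above relies on.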

Generalized Sliced Wasserstein Distances

The generalized Radon transform is utilized to define a new family of distances for probability measures, called generalized sliced-Wasserstein (GSW) distances, and it is shown that, similar to the SW distance, the GSW distance can be extended to a maximum GSW (max-GSW) distance.
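The (linear) sliced Wasserstein distance that GSW generalizes averages exact 1-D distances over random projection directions. A minimal Monte Carlo sketch, assuming equal-size empirical measures with uniform weights:

```python
import numpy as np

def sliced_w2(X, Y, n_proj=200, seed=0):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance:
    average the exact 1-D squared distance over random directions."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)
        # 1-D W2 between equal-size samples = RMS gap of sorted values.
        total += np.mean((np.sort(X @ theta) - np.sort(Y @ theta)) ** 2)
    return np.sqrt(total / n_proj)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
d_same = sliced_w2(X, X)          # identical samples: distance is zero
d_diff = sliced_w2(X, X + 2.0)    # shifted copy: strictly positive
```

GSW replaces the linear projections `X @ theta` with nonlinear projections obtained from a generalized Radon transform; the averaging-over-slices structure is unchanged.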

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
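Adam's update rule as described: exponential moving averages of the gradient and its elementwise square, bias-corrected before the step. A minimal sketch on a toy quadratic, with the default hyperparameters from the paper (the quadratic objective and step count are illustrative):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: first/second moment estimates with bias correction."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = ||theta - target||^2 with Adam.
target = np.array([3.0, -1.0])
theta = np.zeros(2)
m = np.zeros(2); v = np.zeros(2)
for t in range(1, 3001):
    grad = 2 * (theta - target)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.02)
```

Early on the update behaves like a signed step of size `lr` per coordinate; as the gradient shrinks, the bias-corrected ratio keeps the step well scaled without manual learning-rate tuning.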

Distributional Sliced-Wasserstein and Applications to Generative Modeling

This paper proposes a novel distance, the Distributional Sliced-Wasserstein distance (DSWD), that finds an optimal penalized probability measure over the slices; it shows that the DSWD is a generalization of both SWD and Max-SWD, and that the proposed distance can be computed by searching for a push-forward measure over a set of measures satisfying certain constraints.

Sliced Wasserstein Kernels for Probability Distributions

This work provides a family of provably positive definite kernels based on the sliced Wasserstein distance, offering a new perspective on applying optimal-transport-flavored distances through kernel methods in machine learning tasks.