• Corpus ID: 195069208

Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond

Xuechen Li, Denny Wu, Lester W. Mackey, Murat A. Erdogdu
Sampling with Markov chain Monte Carlo methods typically amounts to discretizing some continuous-time dynamics with numerical integration. [...] In addition, we extend our analysis of stochastic Runge-Kutta methods to uniformly dissipative diffusions with possibly non-convex potentials and show they achieve better rates compared to the Euler-Maruyama scheme in terms of the dependence on tolerance $\epsilon$. Numerical studies show that these algorithms lead to better stability and lower asymptotic…
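For orientation, the Euler-Maruyama baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of discretizing the overdamped Langevin diffusion $dX_t = -\nabla f(X_t)\,dt + \sqrt{2}\,dB_t$, not the stochastic Runge-Kutta scheme proposed in the paper. The quadratic potential, the step size, and the names `grad_potential` and `langevin_euler_maruyama` are assumptions made for this example.

```python
import numpy as np


def grad_potential(x):
    # Assumed example potential f(x) = ||x||^2 / 2, so the target
    # density proportional to exp(-f) is a standard Gaussian.
    return x


def langevin_euler_maruyama(grad_f, x0, step_size, n_steps, rng):
    """Euler-Maruyama discretization of overdamped Langevin dynamics."""
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps,) + x.shape)
    for k in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # One step: gradient drift plus Gaussian noise scaled by sqrt(2h).
        x = x - step_size * grad_f(x) + np.sqrt(2.0 * step_size) * noise
        samples[k] = x
    return samples


rng = np.random.default_rng(0)
samples = langevin_euler_maruyama(
    grad_potential, x0=np.zeros(2), step_size=0.05, n_steps=20000, rng=rng
)
```

For this Gaussian example the chain is a linear autoregression, so its stationary variance per coordinate is $1/(1 - h/2)$ for step size $h$ rather than exactly 1. That gap is the discretization bias that higher-order integrators aim to reduce.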


Non-convex weakly smooth Langevin Monte Carlo using regularization

Convexification of a nonconvex domain is used in combination with regularization to prove convergence in Kullback-Leibler (KL) divergence, with the number of iterations needed to reach an $\epsilon$-neighborhood of the target distribution depending only polynomially on the dimension.

Unadjusted Langevin algorithm for non-convex weakly smooth potentials

A new mixture weakly smooth condition is introduced, under which ULA for the smoothed potential is proved to converge given an additional log-Sobolev inequality. Convergence guarantees are also established under isoperimetry and for potentials that are non-strongly convex at infinity.

Improved Discretization Analysis for Underdamped Langevin Monte Carlo

This work provides a novel analysis of ULMC, and obtains the first KL divergence guarantees for ULMC without Hessian smoothness under strong log-concavity, which is based on a new result on the log-Sobolev constant along the underdamped Langevin diffusion.

Convergence of Langevin Monte Carlo in Chi-Squared and Rényi Divergence

This framework covers a range of non-convex potentials that are first-order smooth and exhibit strong convexity outside of a compact region and recovers the state-of-the-art rates in KL divergence, total variation and 2-Wasserstein distance in the same setup.

Riemannian Langevin Algorithm for Solving Semidefinite Programs

It is shown that the Langevin algorithm achieves $\epsilon$-multiplicative accuracy with high probability in n-3 iterations, where $n$ is the size of the cost matrix, and provides a global optimality guarantee for the SDP and the Max-Cut problem.

The shifted ODE method for underdamped Langevin MCMC

This paper considers the underdamped Langevin diffusion (ULD) and proposes a numerical approximation using its associated ordinary differential equation (ODE), and shows that the ODE approximation achieves a 2-Wasserstein error of ε in O under the standard smoothness and strong convexity assumptions on the target distribution.

Convergence Analysis of Langevin Monte Carlo in Chi-Square Divergence

We study sampling from a target distribution $\nu_* \propto e^{-f}$ using the unadjusted Langevin Monte Carlo (LMC) algorithm when the target $\nu_*$ satisfies the Poincaré inequality and the potential $f$ is weakly

Towards a Complete Analysis of Langevin Monte Carlo: Beyond Poincaré Inequality

Langevin diffusions are rapidly convergent under appropriate functional inequality assumptions. Hence, it is natural to expect that with additional smoothness conditions to handle the discretization

On the Ergodicity, Bias and Asymptotic Normality of Randomized Midpoint Sampling Method

This paper describes the stationary distribution of the discrete chain obtained with constant step-size discretization and shows that it is biased away from the target distribution, and establishes the asymptotic normality for numerical integration using the randomized midpoint method.

A Dynamical System View of Langevin-Based Non-Convex Sampling

A new framework is developed that yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators and motivates more efficient schemes that enjoy the same rigorous guarantees.



On sampling from a log-concave density using kinetic Langevin diffusions

The geometric mixing property of the kinetic Langevin diffusion is proved, with a mixing rate that is, in the overdamped regime, optimal in terms of its dependence on the condition number.

Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis

The present work provides a nonasymptotic analysis in the context of non-convex learning problems, giving finite-time guarantees for SGLD to find approximate minimizers of both empirical and population risks.

Direct Runge-Kutta Discretization Achieves Acceleration

It is proved that under Lipschitz-gradient, convexity and order-$(s+2)$ differentiability assumptions, the sequence of iterates generated by discretizing the proposed second-order ODE converges to the optimal solution at a rate of $\mathcal{O}({N^{-2\frac{s}{s+1}}})$, where $s$ is the order of the Runge-Kutta numerical integrator.

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

For the first time, a global convergence guarantee is proved for variance reduced stochastic gradient Langevin dynamics (VR-SGLD) to the almost minimizer within $\tilde O\big(\sqrt{n}d^5/(\lambda^4\epsilon^{5/2})\big)$ stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime.

Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients

This work provides a non-asymptotic upper bound on the mixing time of the Metropolized HMC with explicit choices of stepsize and number of leapfrog steps, and provides a general framework for sharpening mixing time bounds for Markov chains initialized at a substantial distance from the target distribution over continuous spaces.

A Complete Recipe for Stochastic Gradient MCMC

This paper provides a general recipe for constructing MCMC samplers, including stochastic gradient versions, based on continuous Markov processes specified via two matrices, and uses the recipe to straightforwardly propose a new state-adaptive sampler: stochastic gradient Riemann Hamiltonian Monte Carlo (SGRHMC).

User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient

Underdamped Langevin MCMC: A non-asymptotic analysis

An MCMC algorithm based on its discretization is presented and it is shown that it achieves $\varepsilon$ error (in 2-Wasserstein distance) in $\mathcal{O}(\sqrt{d}/\varepsilon)$ steps, a significant improvement over the best known rate for overdamped Langevin MCMC.

On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators

This paper considers general SG-MCMCs with high-order integrators, and develops theory to analyze finite-time convergence properties and their asymptotic invariant measures.

Sharp Convergence Rates for Langevin Dynamics in the Nonconvex Setting

Both overdamped and underdamped Langevin MCMC are studied and upper bounds on the number of steps required to obtain a sample from a distribution that is within $\epsilon$ of $p^*$ in $1$-Wasserstein distance are established.