# Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond

    @article{Li2019StochasticRA,
      title   = {Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond},
      author  = {Xuechen Li and Denny Wu and Lester W. Mackey and Murat A. Erdogdu},
      journal = {ArXiv},
      year    = {2019},
      volume  = {abs/1906.07868}
    }

Sampling with Markov chain Monte Carlo methods typically amounts to discretizing some continuous-time dynamics with numerical integration. In addition, we extend our analysis of stochastic Runge-Kutta methods to uniformly dissipative diffusions with possibly non-convex potentials and show that they achieve better rates than the Euler-Maruyama scheme in terms of the dependence on the tolerance $\epsilon$. Numerical studies show that these algorithms lead to better stability and lower asymptotic…

## 52 Citations

### Non-convex weakly smooth Langevin Monte Carlo using regularization

- Mathematics, Computer Science
- 2021

Convexification of the nonconvex domain is combined with regularization to prove convergence in Kullback-Leibler (KL) divergence, with the number of iterations needed to reach an $\epsilon$-neighborhood of the target distribution depending only polynomially on the dimension.

### Unadjusted Langevin algorithm for non-convex weakly smooth potentials

- Computer Science, Mathematics
- 2021

A new mixture weakly smooth condition is introduced, under which ULA applied to the smoothed potential is proved to converge given an additional log-Sobolev inequality; convergence guarantees are also established under isoperimetry and non-strong convexity at infinity.

### Improved Discretization Analysis for Underdamped Langevin Monte Carlo

- Computer Science
- 2023

This work provides a novel analysis of ULMC, and obtains the first KL divergence guarantees for ULMC without Hessian smoothness under strong log-concavity, which is based on a new result on the log-Sobolev constant along the underdamped Langevin diffusion.

### Convergence of Langevin Monte Carlo in Chi-Squared and Rényi Divergence

- Computer Science, Mathematics
- AISTATS
- 2022

This framework covers a range of non-convex potentials that are first-order smooth and exhibit strong convexity outside of a compact region, and recovers the state-of-the-art rates in KL divergence, total variation, and 2-Wasserstein distance in the same setup.

### Riemannian Langevin Algorithm for Solving Semidefinite Programs

- Computer Science, Mathematics
- ArXiv
- 2020

It is shown that the Langevin algorithm achieves $\epsilon$-multiplicative accuracy with high probability in n-3 iterations, where $n$ is the size of the cost matrix, providing a global optimality guarantee for the SDP and the Max-Cut problem.

### The shifted ODE method for underdamped Langevin MCMC

- Computer Science
- ArXiv
- 2021

This paper considers the underdamped Langevin diffusion (ULD) and proposes a numerical approximation based on its associated ordinary differential equation (ODE), showing that the ODE approximation achieves a 2-Wasserstein error of $\epsilon$ in $\mathcal{O}(\cdot)$ steps under the standard smoothness and strong convexity assumptions on the target distribution.

### Convergence Analysis of Langevin Monte Carlo in Chi-Square Divergence

- Mathematics
- 2020

We study sampling from a target distribution $\nu_* \propto e^{-f}$ using the unadjusted Langevin Monte Carlo (LMC) algorithm when the target $\nu_*$ satisfies the Poincaré inequality and the potential $f$ is weakly…

### Towards a Complete Analysis of Langevin Monte Carlo: Beyond Poincaré Inequality

- Mathematics
- 2023

Langevin diffusions are rapidly convergent under appropriate functional inequality assumptions. Hence, it is natural to expect that with additional smoothness conditions to handle the discretization…

### On the Ergodicity, Bias and Asymptotic Normality of Randomized Midpoint Sampling Method

- Computer Science, Mathematics
- NeurIPS
- 2020

This paper describes the stationary distribution of the discrete chain obtained with constant step-size discretization and shows that it is biased away from the target distribution, and establishes the asymptotic normality for numerical integration using the randomized midpoint method.

### A Dynamical System View of Langevin-Based Non-Convex Sampling

- Computer Science
- ArXiv
- 2022

A new framework is developed that yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators and motivates more efficient schemes that enjoy the same rigorous guarantees.

## References

Showing 1–10 of 71 references

### On sampling from a log-concave density using kinetic Langevin diffusions

- Computer Science
- ArXiv
- 2018

It is proved that the kinetic Langevin diffusion has a geometric mixing property, with a mixing rate that is, in the overdamped regime, optimal in terms of its dependence on the condition number.

### Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis

- Computer Science, Mathematics
- COLT
- 2017

The present work provides a nonasymptotic analysis in the context of non-convex learning problems, giving finite-time guarantees for SGLD to find approximate minimizers of both empirical and population risks.

### Direct Runge-Kutta Discretization Achieves Acceleration

- Computer Science
- NeurIPS
- 2018

It is proved that under Lipschitz-gradient, convexity and order-$(s+2)$ differentiability assumptions, the sequence of iterates generated by discretizing the proposed second-order ODE converges to the optimal solution at a rate of $\mathcal{O}({N^{-2\frac{s}{s+1}}})$, where $s$ is the order of the Runge-Kutta numerical integrator.
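As a rough illustration of discretizing an accelerated second-order ODE with a Runge-Kutta integrator, the sketch below applies classical fourth-order Runge-Kutta to the Su-Boyd-Candès ODE $\ddot{x} + (3/t)\dot{x} + \nabla f(x) = 0$ on a quadratic objective. The cited paper studies a related but different ODE family and general order-$s$ integrators, so treat this as a stand-in, not their scheme.

```python
import numpy as np

def grad_f(x):
    # Objective f(x) = x^2 / 2 with minimizer at 0 (illustrative choice).
    return x

def ode_rhs(t, z):
    # First-order form of x'' + (3/t) x' + grad f(x) = 0, with z = (x, v).
    x, v = z
    return np.array([v, -3.0 / t * v - grad_f(x)])

def rk4_step(t, z, h):
    # One classical fourth-order Runge-Kutta step.
    k1 = ode_rhs(t, z)
    k2 = ode_rhs(t + h / 2, z + h / 2 * k1)
    k3 = ode_rhs(t + h / 2, z + h / 2 * k2)
    k4 = ode_rhs(t + h, z + h * k3)
    return z + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

t, z = 1.0, np.array([5.0, 0.0])   # start at x = 5 with zero velocity
for _ in range(500):
    z = rk4_step(t, z, 0.05)
    t += 0.05
```

The damping term $(3/t)\dot{x}$ drives the trajectory toward the minimizer at an accelerated $O(1/t^2)$ rate in objective value, and a high-order integrator lets one take larger steps while tracking the continuous trajectory.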

### Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

- Computer Science
- NeurIPS
- 2018

For the first time, a global convergence guarantee is proved for variance-reduced stochastic gradient Langevin dynamics (VR-SGLD), which reaches the almost minimizer after $\tilde O\big(\sqrt{n}d^5/(\lambda^4\epsilon^{5/2})\big)$ stochastic gradient evaluations, outperforming the gradient complexities of GLD and SGLD in a wide regime.

### Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2020

This work provides a non-asymptotic upper bound on the mixing time of Metropolized HMC with explicit choices of step size and number of leapfrog steps, and provides a general framework for sharpening mixing time bounds for Markov chains over continuous spaces initialized at a substantial distance from the target distribution.

### A Complete Recipe for Stochastic Gradient MCMC

- Computer Science, Mathematics
- NIPS
- 2015

This paper provides a general recipe for constructing MCMC samplers--including stochastic gradient versions--based on continuous Markov processes specified via two matrices, and uses the recipe to straightforwardly propose a new state-adaptive sampler: stochastic gradient Riemannian Hamiltonian Monte Carlo (SGRHMC).

### User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient

- Computer Science
- Stochastic Processes and their Applications
- 2019

### Underdamped Langevin MCMC: A non-asymptotic analysis

- Computer Science
- COLT
- 2018

An MCMC algorithm based on its discretization is presented, and it is shown to achieve $\varepsilon$ error (in 2-Wasserstein distance) in $\mathcal{O}(\sqrt{d}/\varepsilon)$ steps, a significant improvement over the best known rate for overdamped Langevin MCMC.

### On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators

- Computer Science
- NIPS
- 2015

This paper considers general SG-MCMCs with high-order integrators, and develops theory to analyze finite-time convergence properties and their asymptotic invariant measures.

### Sharp Convergence Rates for Langevin Dynamics in the Nonconvex Setting

- Computer Science
- ArXiv
- 2018

Both overdamped and underdamped Langevin MCMC are studied and upper bounds on the number of steps required to obtain a sample from a distribution that is within $\epsilon$ of $p*$ in $1$-Wasserstein distance are established.