• Corpus ID: 195069208

# Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond

@article{Li2019StochasticRA,
  title={Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond},
  author={Xuechen Li and Denny Wu and Lester W. Mackey and Murat A. Erdogdu},
  journal={ArXiv},
  year={2019},
  volume={abs/1906.07868}
}
• Published 19 June 2019
• Computer Science
• ArXiv
Sampling with Markov chain Monte Carlo methods typically amounts to discretizing some continuous-time dynamics with numerical integration. In addition, we extend our analysis of stochastic Runge-Kutta methods to uniformly dissipative diffusions with possibly non-convex potentials and show that they achieve better rates than the Euler-Maruyama scheme in terms of the dependence on the tolerance $\epsilon$. Numerical studies show that these algorithms lead to better stability and lower asymptotic…
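The baseline the paper compares against is the Euler-Maruyama discretization of the overdamped Langevin diffusion $dX_t = -\nabla f(X_t)\,dt + \sqrt{2}\,dB_t$. A minimal sketch (the Gaussian test target, step size, and chain length below are illustrative assumptions, not values from the paper):

```python
import numpy as np

def langevin_euler_maruyama(grad_f, x0, h, n_steps, rng):
    """Unadjusted Langevin Monte Carlo via Euler-Maruyama:
    x_{k+1} = x_k - h * grad_f(x_k) + sqrt(2h) * xi_k,  xi_k ~ N(0, I)."""
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x - h * grad_f(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# Standard Gaussian target: f(x) = ||x||^2 / 2, so grad_f(x) = x.
rng = np.random.default_rng(0)
samples = langevin_euler_maruyama(lambda x: x, np.zeros(2), h=0.1, n_steps=5000, rng=rng)
print(samples[1000:].mean(axis=0))  # roughly [0, 0]
```

Note the constant-step chain is biased away from the target (here the stationary variance is slightly above 1); higher-order integrators such as the stochastic Runge-Kutta schemes studied in the paper reduce this discretization error.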

## Citations

• Mathematics, Computer Science
• 2021
Convexification of the nonconvex domain is used in combination with regularization to prove convergence in Kullback-Leibler (KL) divergence, with the number of iterations needed to reach an $\epsilon$-neighborhood of a target distribution depending only polynomially on the dimension.
• Computer Science, Mathematics
• 2021
A new mixture weakly smooth condition is introduced, under which ULA applied to a smoothed potential is proved to converge given an additional log-Sobolev inequality; convergence guarantees are also established under isoperimetry and for potentials that are non-strongly convex at infinity.
• Computer Science
• 2023
This work provides a novel analysis of ULMC, and obtains the first KL divergence guarantees for ULMC without Hessian smoothness under strong log-concavity, which is based on a new result on the log-Sobolev constant along the underdamped Langevin diffusion.
• Computer Science, Mathematics
AISTATS
• 2022
This framework covers a range of non-convex potentials that are first-order smooth and exhibit strong convexity outside of a compact region, and recovers the state-of-the-art rates in KL divergence, total variation, and 2-Wasserstein distance in the same setup.
• Computer Science, Mathematics
ArXiv
• 2020
It is shown that the Langevin algorithm achieves $\epsilon$-multiplicative accuracy with high probability in $n^3$ iterations, where $n$ is the size of the cost matrix, and provides a global optimality guarantee for the SDP and the Max-Cut problem.
• Computer Science
ArXiv
• 2021
This paper considers the underdamped Langevin diffusion (ULD) and proposes a numerical approximation using its associated ordinary differential equation (ODE), showing that the ODE approximation achieves a 2-Wasserstein error of $\varepsilon$ in $\mathcal{O}(\cdot)$ steps under the standard smoothness and strong convexity assumptions on the target distribution.
• Murat A. Erdogdu
• Mathematics
• 2020
We study sampling from a target distribution $\nu_* \propto e^{-f}$ using the unadjusted Langevin Monte Carlo (LMC) algorithm when the target $\nu_*$ satisfies the Poincaré inequality and the potential $f$ is weakly smooth…
• Mathematics
• 2023
Langevin diffusions are rapidly convergent under appropriate functional inequality assumptions. Hence, it is natural to expect that with additional smoothness conditions to handle the discretization…
• Computer Science, Mathematics
NeurIPS
• 2020
This paper describes the stationary distribution of the discrete chain obtained with constant step-size discretization and shows that it is biased away from the target distribution, and establishes the asymptotic normality for numerical integration using the randomized midpoint method.
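The randomized midpoint method mentioned here evaluates the gradient at a uniformly random point along each step. A minimal sketch for the overdamped Langevin diffusion (the cited work also treats other integrators; the Gaussian test target and parameters are illustrative assumptions):

```python
import numpy as np

def randomized_midpoint_step(grad_f, x, h, rng):
    """One randomized-midpoint step for dX = -grad_f(X) dt + sqrt(2) dB."""
    alpha = rng.uniform()  # random fraction of the step at which to evaluate the gradient
    # Split the Brownian increment over [0, alpha*h] and [alpha*h, h] consistently.
    b_mid = np.sqrt(alpha * h) * rng.standard_normal(x.size)
    b_rest = np.sqrt((1.0 - alpha) * h) * rng.standard_normal(x.size)
    # Euler predictor at the random midpoint, then a full step using the midpoint gradient.
    x_mid = x - alpha * h * grad_f(x) + np.sqrt(2.0) * b_mid
    return x - h * grad_f(x_mid) + np.sqrt(2.0) * (b_mid + b_rest)

# Quick check on a standard Gaussian target (grad_f(x) = x).
rng = np.random.default_rng(1)
x = np.zeros(2)
traj = []
for _ in range(5000):
    x = randomized_midpoint_step(lambda z: z, x, 0.1, rng)
    traj.append(x.copy())
traj = np.array(traj)
print(traj[1000:].var())  # roughly 1
```

Reusing `b_mid` in both the predictor and the full step keeps the two evaluations on the same Brownian path, which is what distinguishes this from simply halving the step size.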
• Computer Science
ArXiv
• 2022
A new framework is developed that yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators and motivates more efficient schemes that enjoy the same rigorous guarantees.

## References

Showing 1–10 of 71 references.

• Computer Science
ArXiv
• 2018
It is proved that the kinetic Langevin diffusion enjoys a geometric mixing property, with a mixing rate that is, in the overdamped regime, optimal in terms of its dependence on the condition number.
• Computer Science, Mathematics
COLT
• 2017
The present work provides a nonasymptotic analysis in the context of non-convex learning problems, giving finite-time guarantees for SGLD to find approximate minimizers of both empirical and population risks.
• Computer Science
NeurIPS
• 2018
It is proved that under Lipschitz-gradient, convexity and order-$(s+2)$ differentiability assumptions, the sequence of iterates generated by discretizing the proposed second-order ODE converges to the optimal solution at a rate of $\mathcal{O}({N^{-2\frac{s}{s+1}}})$, where $s$ is the order of the Runge-Kutta numerical integrator.
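The rate above comes from applying an order-$s$ Runge-Kutta integrator to an accelerated second-order ODE. As a concrete sketch, classical RK4 applied to Nesterov's ODE $\ddot{x} + \frac{3}{t}\dot{x} + \nabla f(x) = 0$ (the quadratic potential, step size, and horizon below are illustrative choices, not from the cited paper):

```python
import numpy as np

def rk4_step(rhs, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = rhs(t, y)."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, y + h / 2 * k1)
    k3 = rhs(t + h / 2, y + h / 2 * k2)
    k4 = rhs(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Nesterov's ODE  x'' + (3/t) x' + grad_f(x) = 0, as a first-order system
# y = (x, v); here f(x) = ||x||^2 / 2 so grad_f(x) = x.
def nesterov_rhs(t, y):
    x, v = y[:2], y[2:]
    return np.concatenate([v, -(3.0 / t) * v - x])

y = np.concatenate([np.array([3.0, -2.0]), np.zeros(2)])  # start at x0 with zero velocity
t, h = 1.0, 0.05
for _ in range(400):
    y = rk4_step(nesterov_rhs, t, y, h)
    t += h
print(y[:2])  # x(t) approaches the minimizer [0, 0] as t grows
```

The continuous dynamics satisfy $f(x(t)) - f^* = \mathcal{O}(1/t^2)$, and the snippet's point is that a sufficiently high-order integrator preserves this acceleration after discretization.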
• Computer Science
NeurIPS
• 2018
For the first time, a global convergence guarantee is proved for variance-reduced stochastic gradient Langevin dynamics (VR-SGLD) to an almost minimizer after $\tilde O\big(\sqrt{n}d^5/(\lambda^4\epsilon^{5/2})\big)$ stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime.
• Mathematics, Computer Science
J. Mach. Learn. Res.
• 2020
This work provides a non-asymptotic upper bound on the mixing time of Metropolized HMC with explicit choices of step size and number of leapfrog steps, and a general framework for sharpening mixing-time bounds for Markov chains initialized at a substantial distance from the target distribution over continuous spaces.
• Computer Science, Mathematics
NIPS
• 2015
This paper provides a general recipe for constructing MCMC samplers--including stochastic gradient versions--based on continuous Markov processes specified via two matrices, and uses the recipe to straightforwardly propose a new state-adaptive sampler: stochastic gradient Riemann Hamiltonian Monte Carlo (SGRHMC).
• Computer Science
COLT
• 2018
An MCMC algorithm based on its discretization is presented, and it is shown to achieve $\varepsilon$ error (in 2-Wasserstein distance) in $\mathcal{O}(\sqrt{d}/\varepsilon)$ steps, a significant improvement over the best known rate for overdamped Langevin MCMC.
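The underdamped dynamics behind this rate are $dv = -(\gamma v + u\nabla f(x))\,dt + \sqrt{2\gamma u}\,dB_t$, $dx = v\,dt$. A plain Euler-Maruyama sketch (the cited work analyzes a sharper integrator that is exact on the linear part; the parameter values and Gaussian target are illustrative assumptions):

```python
import numpy as np

def underdamped_euler_step(grad_f, x, v, h, gamma, u, rng):
    """Euler-Maruyama step for underdamped Langevin:
    dv = -(gamma*v + u*grad_f(x)) dt + sqrt(2*gamma*u) dB,  dx = v dt."""
    noise = np.sqrt(2.0 * gamma * u * h) * rng.standard_normal(v.size)
    v_new = v - h * (gamma * v + u * grad_f(x)) + noise
    x_new = x + h * v
    return x_new, v_new

# Standard Gaussian target (grad_f(x) = x); x should equilibrate near unit variance.
rng = np.random.default_rng(2)
x, v = np.zeros(2), np.zeros(2)
xs = []
for _ in range(20000):
    x, v = underdamped_euler_step(lambda z: z, x, v, h=0.05, gamma=2.0, u=1.0, rng=rng)
    xs.append(x.copy())
xs = np.array(xs)
print(xs[5000:].var())  # roughly 1
```

Because the noise enters only through the velocity, the position path is smoother than in the overdamped chain, which is one intuition for the improved $\sqrt{d}$ dimension dependence.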
• Computer Science
NIPS
• 2015
This paper considers general SG-MCMCs with high-order integrators, and develops theory to analyze finite-time convergence properties and their asymptotic invariant measures.
• Computer Science
ArXiv
• 2018
Both overdamped and underdamped Langevin MCMC are studied, and upper bounds on the number of steps required to obtain a sample from a distribution that is within $\epsilon$ of $p^*$ in $1$-Wasserstein distance are established.