Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond
@article{Li2019StochasticRA,
  title   = {Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond},
  author  = {Xuechen Li and Denny Wu and Lester W. Mackey and Murat A. Erdogdu},
  journal = {ArXiv},
  year    = {2019},
  volume  = {abs/1906.07868}
}
Sampling with Markov chain Monte Carlo methods typically amounts to discretizing some continuous-time dynamics with numerical integration. Key Result: In addition, we extend our analysis of stochastic Runge-Kutta methods to uniformly dissipative diffusions with possibly non-convex potentials and show that they achieve better rates than the Euler-Maruyama scheme in terms of the dependence on the tolerance $\epsilon$. Numerical studies show that these algorithms lead to better stability and lower asymptotic…
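To make the kind of comparison described above concrete, here is a minimal, self-contained sketch (Python with NumPy assumed) contrasting an Euler-Maruyama step for the overdamped Langevin diffusion $dX_t = -\nabla f(X_t)\,dt + \sqrt{2}\,dW_t$ with a simple two-stage, Heun-type stochastic Runge-Kutta step on a standard Gaussian target. This is only an illustrative toy under those assumptions, not the specific stochastic Runge-Kutta integrator analyzed in the paper.

```python
# Toy comparison of two discretizations of the overdamped Langevin diffusion
#   dX_t = -grad f(X_t) dt + sqrt(2) dW_t
# Target: standard Gaussian, f(x) = ||x||^2 / 2, so grad f(x) = x.
# (Illustrative sketch only; not the paper's exact integrator.)
import numpy as np

def grad_f(x):
    return x  # gradient of the quadratic potential f(x) = ||x||^2 / 2

def euler_maruyama_step(x, h, rng):
    # One Euler-Maruyama step: explicit Euler on the drift plus Gaussian noise.
    noise = np.sqrt(2.0 * h) * rng.standard_normal(x.shape)
    return x - h * grad_f(x) + noise

def srk_heun_step(x, h, rng):
    # Two-stage (Heun-type) stochastic Runge-Kutta step: predictor-corrector
    # averaging of the drift, reusing the same Brownian increment.
    noise = np.sqrt(2.0 * h) * rng.standard_normal(x.shape)
    x_tilde = x - h * grad_f(x) + noise           # predictor (Euler-Maruyama)
    drift = -0.5 * (grad_f(x) + grad_f(x_tilde))  # averaged drift (corrector)
    return x + h * drift + noise

rng = np.random.default_rng(0)
x_em = x_rk = np.zeros(2)
for _ in range(10_000):
    x_em = euler_maruyama_step(x_em, 0.1, rng)
    x_rk = srk_heun_step(x_rk, 0.1, rng)
print(x_em, x_rk)  # both chains should look like draws from N(0, I)
```

Both chains target $\mathcal{N}(0, I)$; the two-stage step averages the drift over a predictor and a corrector stage, which is the basic mechanism by which higher-order integrators reduce discretization bias relative to Euler-Maruyama.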
52 Citations
Non-convex weakly smooth Langevin Monte Carlo using regularization
- Mathematics, Computer Science
- 2021
Convexification of the non-convex domain is used in combination with regularization to prove convergence in Kullback-Leibler (KL) divergence, with the number of iterations needed to reach an $\epsilon$-neighborhood of the target distribution depending only polynomially on the dimension.
Unadjusted Langevin algorithm for non-convex weakly smooth potentials
- Computer Science, Mathematics
- 2021
A new mixture weakly smooth condition is introduced, under which ULA applied to the smoothed potential is proved to converge given an additional log-Sobolev inequality, and convergence guarantees are established under isoperimetry and under non-strong convexity at infinity.
Improved Discretization Analysis for Underdamped Langevin Monte Carlo
- Computer Science
- 2023
This work provides a novel analysis of ULMC and obtains the first KL-divergence guarantees for ULMC without Hessian smoothness under strong log-concavity, based on a new result on the log-Sobolev constant along the underdamped Langevin diffusion.
Convergence of Langevin Monte Carlo in Chi-Squared and Rényi Divergence
- Computer Science, Mathematics, AISTATS
- 2022
This framework covers a range of non-convex potentials that are first-order smooth and exhibit strong convexity outside of a compact region, and recovers the state-of-the-art rates in KL divergence, total variation, and 2-Wasserstein distance in the same setup.
Riemannian Langevin Algorithm for Solving Semidefinite Programs
- Computer Science, Mathematics, ArXiv
- 2020
It is shown that the Langevin algorithm achieves $\epsilon$-multiplicative accuracy with high probability in n-3 iterations, where $n$ is the size of the cost matrix, and a global optimality guarantee is provided for the SDP and the Max-Cut problem.
The shifted ODE method for underdamped Langevin MCMC
- Computer Science, ArXiv
- 2021
This paper considers the underdamped Langevin diffusion (ULD) and proposes a numerical approximation using its associated ordinary differential equation (ODE), showing that the ODE approximation achieves a 2-Wasserstein error of $\epsilon$ in $\mathcal{O}(\cdot)$ steps under the standard smoothness and strong convexity assumptions on the target distribution.
Convergence Analysis of Langevin Monte Carlo in Chi-Square Divergence
- Mathematics
- 2020
We study sampling from a target distribution $\nu_* \propto e^{-f}$ using the unadjusted Langevin Monte Carlo (LMC) algorithm when the target $\nu_*$ satisfies the Poincaré inequality and the potential $f$ is weakly…
Towards a Complete Analysis of Langevin Monte Carlo: Beyond Poincaré Inequality
- Mathematics
- 2023
Langevin diffusions are rapidly convergent under appropriate functional inequality assumptions. Hence, it is natural to expect that with additional smoothness conditions to handle the discretization…
On the Ergodicity, Bias and Asymptotic Normality of Randomized Midpoint Sampling Method
- Computer Science, MathematicsNeurIPS
- 2020
This paper describes the stationary distribution of the discrete chain obtained with constant step-size discretization and shows that it is biased away from the target distribution, and establishes the asymptotic normality for numerical integration using the randomized midpoint method.
A Dynamical System View of Langevin-Based Non-Convex Sampling
- Computer Science, ArXiv
- 2022
A new framework is developed that yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators and motivates more efficient schemes that enjoy the same rigorous guarantees.
References
Showing 1-10 of 71 references
On sampling from a log-concave density using kinetic Langevin diffusions
- Computer Science, ArXiv
- 2018
It is proved that the kinetic Langevin diffusion has a geometric mixing property, with a mixing rate that is, in the overdamped regime, optimal in terms of its dependence on the condition number.
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
- Computer Science, Mathematics, COLT
- 2017
The present work provides a nonasymptotic analysis in the context of non-convex learning problems, giving finite-time guarantees for SGLD to find approximate minimizers of both empirical and population risks.
Direct Runge-Kutta Discretization Achieves Acceleration
- Computer Science, NeurIPS
- 2018
It is proved that under Lipschitz-gradient, convexity and order-$(s+2)$ differentiability assumptions, the sequence of iterates generated by discretizing the proposed second-order ODE converges to the optimal solution at a rate of $\mathcal{O}({N^{-2\frac{s}{s+1}}})$, where $s$ is the order of the Runge-Kutta numerical integrator.
Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
- Computer Science, NeurIPS
- 2018
For the first time, a global convergence guarantee is established for variance-reduced stochastic gradient Langevin dynamics (VR-SGLD), which reaches an almost minimizer after $\tilde O\big(\sqrt{n}d^5/(\lambda^4\epsilon^{5/2})\big)$ stochastic gradient evaluations, outperforming the gradient complexities of GLD and SGLD in a wide regime.
Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients
- Mathematics, Computer Science, J. Mach. Learn. Res.
- 2020
This work provides a non-asymptotic upper bound on the mixing time of the Metropolized HMC with explicit choices of stepsize and number of leapfrog steps, and provides a general framework for sharpening mixing time bounds for Markov chains initialized at a substantial distance from the target distribution over continuous spaces.
A Complete Recipe for Stochastic Gradient MCMC
- Computer Science, Mathematics, NIPS
- 2015
This paper provides a general recipe for constructing MCMC samplers--including stochastic gradient versions--based on continuous Markov processes specified via two matrices, and uses the recipe to straightforwardly propose a new state-adaptive sampler: stochastic gradient Riemann Hamiltonian Monte Carlo (SGRHMC).
User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient
- Computer Science, Stochastic Processes and their Applications
- 2019
Underdamped Langevin MCMC: A non-asymptotic analysis
- Computer Science, COLT
- 2018
An MCMC algorithm based on its discretization is presented, and it is shown to achieve $\varepsilon$ error (in 2-Wasserstein distance) in $\mathcal{O}(\sqrt{d}/\varepsilon)$ steps, a significant improvement over the best known rate for overdamped Langevin MCMC.
On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators
- Computer Science, NIPS
- 2015
This paper considers general SG-MCMCs with high-order integrators, and develops theory to analyze finite-time convergence properties and their asymptotic invariant measures.
Sharp Convergence Rates for Langevin Dynamics in the Nonconvex Setting
- Computer Science, ArXiv
- 2018
Both overdamped and underdamped Langevin MCMC are studied, and upper bounds are established on the number of steps required to obtain a sample from a distribution that is within $\epsilon$ of $p^*$ in $1$-Wasserstein distance.