Edinburgh Research Archive

Kinetic Langevin Monte Carlo methods

Authors

Whalley, Peter Archibald

Abstract

In this thesis, we study discretizations of kinetic Langevin dynamics in the context of Markov chain Monte Carlo. We compare the convergence properties of different choices of integrator, providing asymptotic bias estimates and numerical experiments to compare them, and we also present an alternative to Metropolis-Hastings that corrects for the discretization bias. The thesis consists of two parts.

In the first part, we provide a framework for analyzing the convergence of discretized kinetic Langevin dynamics for M-∇Lipschitz, m-convex potentials, with and without stochastic gradients. Our approach gives convergence rates of O(m/M), with explicit stepsize restrictions that are of the same order as the stability threshold for Gaussian targets and are valid for a large interval of the friction parameter. We apply this methodology to various integration schemes that are popular in the molecular dynamics and machine learning communities. Further, we introduce the property of "γ-limit convergence" (GLC) to characterize underdamped Langevin schemes that converge to overdamped dynamics in the high-friction limit and whose stepsize restrictions are independent of the friction parameter; we show that this property is not generic by exhibiting methods both within the class and in its complement. We present numerical experiments for a Bayesian logistic regression example, in which BAOAB is shown to perform best. Finally, we provide asymptotic bias estimates for the BAOAB scheme that remain accurate in the high-friction limit, obtained by comparison with a modified stochastic dynamics that preserves the invariant measure.

In the second part, we present an unbiased method for estimating Bayesian posterior means based on kinetic Langevin dynamics, combining advanced splitting methods with enhanced gradient approximations. Our approach avoids Metropolis correction by coupling Markov chains at different discretization levels in a multilevel Monte Carlo scheme. Theoretical analysis demonstrates that the proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem. It can achieve accuracy ε > 0 for estimating expectations of Lipschitz functions in d dimensions with O(d¹/⁴ε⁻²) expected gradient evaluations, without assuming a warm start. We exhibit similar bounds using both approximate and stochastic gradients, and the method's computational cost is shown to scale independently of the size of the dataset. The proposed method is tested on a multinomial regression problem on the MNIST dataset and a Poisson regression model for soccer scores. Experiments indicate that the number of gradient evaluations per effective sample is independent of dimension, even when using inexact gradients. For product distributions, we give dimension-independent variance bounds. Our results demonstrate that the proposed unbiased algorithm can be much more efficient than the "gold-standard" randomized Hamiltonian Monte Carlo.
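
The splitting schemes compared in the first part are built from elementary updates. As a concrete illustration, below is a minimal sketch of one BAOAB step for the kinetic Langevin dynamics dx = v dt, dv = -∇U(x) dt - γv dt + √(2γ) dW at unit mass and temperature; the names (baoab_step, grad_U) and the NumPy framing are illustrative assumptions, not code from the thesis.

    import numpy as np

    def baoab_step(x, v, grad_U, h, gamma, rng):
        # Illustrative sketch of one BAOAB step (unit mass and temperature);
        # not the thesis's implementation.
        v = v - 0.5 * h * grad_U(x)          # B: half momentum kick
        x = x + 0.5 * h * v                  # A: half position drift
        eta = np.exp(-gamma * h)             # O: exact Ornstein-Uhlenbeck update
        v = eta * v + np.sqrt(1.0 - eta**2) * rng.standard_normal(x.shape)
        x = x + 0.5 * h * v                  # A: half position drift
        v = v - 0.5 * h * grad_U(x)          # B: half momentum kick
        return x, v

As γh grows, the O-step resamples the velocity almost completely; this is the high-friction regime addressed by the GLC property above.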

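The coupling underlying the multilevel construction in the second part can be illustrated in the same style. One natural synchronous coupling drives a coarse chain of stepsize h and a fine chain of stepsize h/2 with the same Brownian path, assembling the coarse Ornstein-Uhlenbeck noise from the two fine increments; the sketch below, including the helper names, is a generic illustration under these assumptions rather than the thesis's exact estimator.

    import numpy as np

    def baoab_step_with_noise(x, v, grad_U, h, gamma, xi):
        # Same BAOAB step as above, but with the Gaussian noise xi passed in
        # so that two chains can share it.
        v = v - 0.5 * h * grad_U(x)
        x = x + 0.5 * h * v
        eta = np.exp(-gamma * h)
        v = eta * v + np.sqrt(1.0 - eta**2) * xi
        x = x + 0.5 * h * v
        v = v - 0.5 * h * grad_U(x)
        return x, v

    def coupled_pair_step(xc, vc, xf, vf, grad_U, h, gamma, rng):
        # Advance a coarse (stepsize h) and a fine (stepsize h/2) chain by one
        # coarse step, sharing the driving noise.
        eta_f = np.exp(-gamma * h / 2.0)
        sig_f = np.sqrt(1.0 - eta_f**2)
        sig_c = np.sqrt(1.0 - eta_f**4)      # coarse OU noise scale (eta_c = eta_f**2)
        xi1 = rng.standard_normal(xf.shape)  # fine noise, first half-step
        xi2 = rng.standard_normal(xf.shape)  # fine noise, second half-step
        # Coarse noise built from the fine increments, so Var(xi_c) = 1 and both
        # chains see the same underlying Brownian path:
        xi_c = (eta_f * sig_f * xi1 + sig_f * xi2) / sig_c
        xf, vf = baoab_step_with_noise(xf, vf, grad_U, h / 2.0, gamma, xi1)
        xf, vf = baoab_step_with_noise(xf, vf, grad_U, h / 2.0, gamma, xi2)
        xc, vc = baoab_step_with_noise(xc, vc, grad_U, h, gamma, xi_c)
        return xc, vc, xf, vf

Averages of f(x_fine) - f(x_coarse) over coupled trajectories estimate the difference in bias between the two discretization levels.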
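
Unbiasedness then follows from a randomized-level construction in the spirit of single-term multilevel estimators: a random level is drawn, the coupled difference at that level is computed, and the result is reweighted by the level probability. The schematic below is one standard form of such an estimator; sample_level_difference is a hypothetical helper returning the coupled difference at a given level (the base-level term at level 0).

    import numpy as np

    def single_term_estimator(sample_level_difference, p, rng):
        # Draw a level L with probability p[L], estimate the level-L difference
        # Delta_L, and reweight by 1/p[L]; E[Delta_L / p[L]] = sum_l E[Delta_l].
        L = rng.choice(len(p), p=p)
        return sample_level_difference(L) / p[L]

Choosing the level probabilities p so that the reweighted differences have summable variance is what yields the finite-variance, central-limit behavior described above.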