Scalable geometric Markov chain Monte Carlo
Item statusRestricted Access
Embargo end date31/12/2100
Markov chain Monte Carlo (MCMC) is one of the most popular statistical inference methods in machine learning. Recent work shows that a significant improvement of the statistical efficiency of MCMC on complex distributions can be achieved by exploiting geometric properties of the target distribution. This is known as geometric MCMC. However, many such methods, like Riemannian manifold Hamiltonian Monte Carlo (RMHMC), are computationally challenging to scale up to high dimensional distributions. The primary goal of this thesis is to develop novel geometric MCMC methods applicable to large-scale problems. To overcome the computational bottleneck of computing second order derivatives in geometric MCMC, I propose an adaptive MCMC algorithm using an efficient approximation based on Limited memory BFGS. I also propose a simplified variant of RMHMC that is able to work effectively on larger scale than the previous methods. Finally, I address an important limitation of geometric MCMC, namely that is only available for continuous distributions. I investigate a relaxation of discrete variables to continuous variables that allows us to apply the geometric methods. This is a new direction of MCMC research which is of potential interest to many applications. The effectiveness of the proposed methods is demonstrated on a wide range of popular models, including generalised linear models, conditional random fields (CRFs), hierarchical models and Boltzmann machines.