Auxiliary variable Markov chain Monte Carlo methods
Abstract
Markov chain Monte Carlo (MCMC) methods are a widely applicable
class of algorithms for estimating integrals in statistical inference problems.
A common approach in MCMC methods is to introduce additional
auxiliary variables into the Markov chain state and perform transitions
in the joint space of target and auxiliary variables. In this thesis we consider
novel methods for using auxiliary variables within MCMC methods
to allow approximate inference in otherwise intractable models and to
improve sampling performance in models exhibiting challenging properties
such as multimodality.
We first consider the pseudo-marginal framework, which extends the
Metropolis–Hastings algorithm to cases where we only have access to
an unbiased estimator of the density of the target distribution. The resulting
chains can sometimes show ‘sticking’ behaviour, where long series
of proposed updates are rejected. Further, the algorithms can be difficult
to tune, and it is not immediately clear how to generalise the approach
to alternative transition operators. We show that if the auxiliary variables
used in the density estimator are included in the chain state, it is
possible to use new transition operators, such as those based on slice-sampling
algorithms, within a pseudo-marginal setting. This auxiliary
pseudo-marginal approach leads to easier-to-tune methods and is often
able to improve sampling efficiency over existing approaches.
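As a minimal illustration of the pseudo-marginal idea described above, the sketch below runs Metropolis–Hastings using only a noisy, unbiased estimate of the target density in the acceptance ratio. The particular estimator and target are toy assumptions for demonstration, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def density_estimate(theta, rng, n_particles=32):
    # Unbiased Monte Carlo estimate of an intractable marginal density:
    # here (a toy assumption) the marginal of N(theta; 0.1*u, 1) over
    # auxiliary draws u ~ N(0, 1). Any non-negative unbiased estimator works.
    u = rng.standard_normal(n_particles)
    weights = np.exp(-0.5 * (theta - 0.1 * u) ** 2) / np.sqrt(2 * np.pi)
    return weights.mean()

def pseudo_marginal_mh(n_iter=5000, step=1.0):
    theta = 0.0
    p_hat = density_estimate(theta, rng)
    samples = []
    for _ in range(n_iter):
        theta_prop = theta + step * rng.standard_normal()
        # Re-estimate the density at the proposal; the noisy estimate
        # stands in for the exact (intractable) target density.
        p_hat_prop = density_estimate(theta_prop, rng)
        if rng.random() < p_hat_prop / p_hat:
            # Accept, and crucially keep the estimate with the state:
            # re-estimating at the current point would bias the chain.
            theta, p_hat = theta_prop, p_hat_prop
        samples.append(theta)
    return np.array(samples)

samples = pseudo_marginal_mh()
```

The ‘sticking’ behaviour mentioned above arises when an accepted state carries an unusually large estimate `p_hat`, which then suppresses subsequent acceptances until it is replaced.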
As a second contribution we consider inference in probabilistic models
defined via a generative process with the probability density of the outputs
of this process only implicitly defined. The approximate Bayesian
computation (ABC) framework allows inference in such models when
conditioning on the values of observed model variables by making the
approximation that generated observed variables are ‘close’ rather than
exactly equal to observed data. Although making the inference problem
more tractable, the approximation error introduced in ABC methods can
be difficult to quantify and standard algorithms tend to perform poorly
when conditioning on high dimensional observations. This often requires
further approximation by reducing the observations to lower
dimensional summary statistics.
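The basic ABC recipe sketched above can be written as a rejection sampler: draw parameters from the prior, run the simulator, and accept when a summary of the simulated data falls within a tolerance of the observed summary. The simulator, prior, and summary statistic here are toy assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, rng, n_obs=50):
    # A generative process whose density we treat as only implicitly
    # defined: a toy Gaussian simulator (an assumption for illustration).
    return theta + rng.standard_normal(n_obs)

def abc_rejection(observed, n_samples=2000, epsilon=0.2):
    # Reduce the 50-dimensional observation to a 1-dimensional summary
    # statistic (the mean), and accept parameter draws whose simulated
    # summary is 'close' to the observed one rather than exactly equal.
    s_obs = observed.mean()
    accepted = []
    while len(accepted) < n_samples:
        theta = rng.uniform(-5, 5)            # draw from a flat prior
        s_sim = simulate(theta, rng).mean()   # simulated summary
        if abs(s_sim - s_obs) < epsilon:      # epsilon-close acceptance
            accepted.append(theta)
    return np.array(accepted)

observed = simulate(2.0, np.random.default_rng(42))
posterior_samples = abc_rejection(observed)
```

Matching all 50 raw observations within a small tolerance would almost never succeed, which is exactly the curse of dimensionality that forces the reduction to summary statistics.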
We show how including all of the random variables used in generating
model outputs as auxiliary variables in the Markov chain state can
allow the use of more efficient and robust MCMC methods, such as slice
sampling and Hamiltonian Monte Carlo (HMC), within an ABC framework.
In some cases this allows inference when conditioning on the full set
of observed values, where standard ABC methods would require reduction
to lower-dimensional summaries for tractability. Further, we
introduce a novel constrained HMC method for performing inference
in a restricted class of differentiable generative models, which allows
the generated observed variables to be conditioned to lie arbitrarily close to
the observed data while maintaining computational tractability.
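To make the auxiliary-variable reformulation concrete: writing the simulator as a deterministic map of the parameters and its base random draws lets the chain operate on the joint state of both, with the epsilon-ball condition imposed as a constraint on that joint space. The sketch below uses a simple random-walk Metropolis update for clarity rather than the slice-sampling or HMC transitions discussed above; the generator, prior, and tolerance are all toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def generator(theta, u):
    # Deterministic map from parameters theta and base draws u to
    # simulated outputs; all randomness lives in u (a toy assumption).
    return theta + u

def log_joint(theta, u, observed, epsilon):
    # Joint target over (theta, u): standard normal prior on theta,
    # standard normal base density on u, restricted to states whose
    # simulated output is epsilon-close to the observed data.
    if np.max(np.abs(generator(theta, u) - observed)) >= epsilon:
        return -np.inf
    return -0.5 * theta ** 2 - 0.5 * np.sum(u ** 2)

def abc_mcmc_aux(observed, n_iter=5000, epsilon=0.5, step=0.1):
    theta = 0.0
    u = observed - theta  # start exactly on the data: distance zero
    lp = log_joint(theta, u, observed, epsilon)
    samples = []
    for _ in range(n_iter):
        # Jointly perturb parameters and auxiliary base draws.
        theta_p = theta + step * rng.standard_normal()
        u_p = u + step * rng.standard_normal(u.shape[0])
        lp_p = log_joint(theta_p, u_p, observed, epsilon)
        if np.log(rng.random()) < lp_p - lp:
            theta, u, lp = theta_p, u_p, lp_p
        samples.append(theta)
    return np.array(samples)

observed = generator(1.5, np.random.default_rng(7).standard_normal(5))
samples = abc_mcmc_aux(observed)
```

Because the state can be initialised so that the generated output equals the observed data exactly, the chain starts inside the constraint set, which is what makes conditioning on the full observation feasible here.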
As a final topic, we consider the use of an auxiliary temperature variable
in MCMC methods to improve exploration of multimodal target densities
and to allow estimation of normalising constants. Existing approaches
such as simulated tempering and annealed importance sampling use
temperature variables which take on only a discrete set of values. The
performance of these methods can be sensitive to the number and spacing
of the temperature values used, and the discrete nature of the temperature
variable prevents the use of gradient-based methods such as
HMC to update the temperature alongside the target variables. We introduce
new MCMC methods which instead use a continuous temperature
variable. This both removes the need to tune the choice of discrete
temperature values and allows the temperature variable to be updated
jointly with the target variables within an HMC method.
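A minimal sketch of a continuous temperature variable, under toy assumptions: a geometric bridge between a broad base density and a bimodal target, with the state and the temperature updated jointly by random-walk Metropolis (standing in for the HMC updates discussed above; the densities and step sizes are illustrative choices, not from the thesis).

```python
import numpy as np

rng = np.random.default_rng(3)

def log_target(x):
    # Bimodal target (toy example): equal mixture of N(-4, 1) and N(4, 1).
    return np.logaddexp(-0.5 * (x - 4) ** 2, -0.5 * (x + 4) ** 2)

def log_base(x):
    # Broad base density the temperature path interpolates from.
    return -0.5 * (x / 5.0) ** 2

def log_joint(x, beta):
    # Geometric bridge with continuous beta in [0, 1]:
    # beta = 0 gives the base density, beta = 1 the target.
    return beta * log_target(x) + (1 - beta) * log_base(x)

def tempered_mh(n_iter=20000, step_x=1.0, step_b=0.1):
    x, beta = 0.0, 0.0
    lp = log_joint(x, beta)
    xs, betas = [], []
    for _ in range(n_iter):
        x_p = x + step_x * rng.standard_normal()
        # Propose a temperature move, reflecting at the boundaries 0 and 1
        # so the proposal stays symmetric.
        beta_p = abs(beta + step_b * rng.standard_normal())
        if beta_p > 1.0:
            beta_p = 2.0 - beta_p
        lp_p = log_joint(x_p, beta_p)
        if np.log(rng.random()) < lp_p - lp:
            x, beta, lp = x_p, beta_p, lp_p
        xs.append(x)
        betas.append(beta)
    return np.array(xs), np.array(betas)

xs, betas = tempered_mh()
```

Excursions to low beta let the chain cross between the two modes through the broad base density, and because beta is continuous there is no discrete temperature ladder to tune.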