Deep generative modelling for amortised variational inference
Probabilistic and statistical modelling are the fundamental frameworks underlying a large proportion of modern machine learning (ML) techniques. These frameworks allow practitioners to develop tailor-made models for their problems that incorporate expert knowledge and can learn from data. In the Bayesian framework, learning from data is referred to as inference. In general, model-specific inference methods are hard to derive, as they require a high level of mathematical and statistical dexterity on the practitioner's part. As a result, a large community of researchers in ML and statistics works towards developing automatic methods of inference (Carpenter et al., 2017; Tran et al., 2016; Kucukelbir et al., 2016; Ge et al., 2018; Salvatier et al., 2016; Uber, 2017; Lintusaari et al., 2018). These methods are generally model-agnostic and are therefore called black-box inference methods. Recent work has shown that the use of deep learning techniques (Rezende and Mohamed, 2015b; Kingma et al., 2016; Srivastava and Sutton, 2017; Mescheder et al., 2017a) within the framework of variational inference (Jordan et al., 1999) not only allows for automatic and accurate inference but does so drastically more efficiently. The added efficiency comes from amortising the cost of learning: deep neural networks exploit the smoothness of the map between data points and their posterior parameters, so that inference for a new data point reduces to a forward pass through a trained network. The field of deep-learning-based amortised variational inference is relatively new, and numerous challenges remain to be tackled before it can be established as a standard method of inference. To this end, this thesis presents four pieces of original work in the domain of automatic amortised variational inference in statistical models.
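The amortisation idea above can be illustrated with a minimal sketch (not from the thesis): a toy conjugate model z ~ N(0, 1), x | z ~ N(z, 1), where the per-datapoint variational parameters are replaced by a shared map x → (μ, σ). Here the "encoder" is a hypothetical linear map standing in for a deep network, chosen to equal the exact posterior so that the Monte Carlo ELBO recovers the true log-evidence pointwise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: z ~ N(0, 1), x | z ~ N(z, 1).
# Amortised VI replaces per-datapoint variational parameters with a
# shared map x -> (mu, log_sigma); this linear "encoder" is an
# illustrative stand-in for a deep network. For this conjugate model
# the exact posterior is N(x/2, 1/2), which the map below matches.
w, b, log_sigma = 0.5, 0.0, np.log(np.sqrt(0.5))

def elbo(x, n_samples=1000):
    """Monte Carlo ELBO estimate via the reparameterisation trick."""
    mu, sigma = w * x + b, np.exp(log_sigma)
    eps = rng.standard_normal(n_samples)
    z = mu + sigma * eps                                # z ~ q(z | x)
    log_p_x_given_z = -0.5 * ((x - z) ** 2 + np.log(2 * np.pi))
    log_p_z = -0.5 * (z ** 2 + np.log(2 * np.pi))
    log_q_z = -0.5 * (eps ** 2 + np.log(2 * np.pi)) - log_sigma
    return np.mean(log_p_x_given_z + log_p_z - log_q_z)

# Since q equals the exact posterior, log p(x, z) - log q(z | x) is
# constant in z, so the estimate matches log p(x) = log N(x; 0, 2).
for x in [-1.0, 0.0, 2.0]:
    print(x, elbo(x), -0.5 * (x ** 2 / 2 + np.log(4 * np.pi)))
```

Training the encoder parameters (w, b, log_sigma) by stochastic gradient ascent on this ELBO, averaged over a dataset, is the amortised counterpart of fitting separate variational parameters per data point.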
We first introduce two sets of techniques for amortising variational inference in Bayesian generative models such as Latent Dirichlet Allocation (Blei et al., 2003) and the Pachinko Allocation Machine (Li and McCallum, 2006). These techniques use deep neural networks and stochastic-gradient-based first-order optimisers for inference, and they can be applied generically to a large number of Bayesian generative models. We also introduce a novel variational framework, VEEGAN, for implicit generative models of data. This framework enables inference in statistical models where, unlike in Bayesian generative models, a prescribed likelihood function is not available; it uses a discriminator-based density ratio estimator (Sugiyama et al., 2012) to deal with the intractability of the likelihood function. Implicit generative models such as generative adversarial networks (GANs; Goodfellow et al., 2014) suffer from learning issues such as mode collapse (Srivastava et al., 2017) and training instability (Arjovsky et al., 2017). We tackle mode collapse in GANs using VEEGAN, and we propose a new training method for implicit generative models, RB-MMDnet, based on an alternative density ratio estimator that provides stable training and optimisation in implicit models. Our results and analysis show that applying deep generative modelling to variational inference is a promising direction for improving the state of black-box inference methods: not only do these methods outperform traditional inference methods on the models in question, they do so in a fraction of the time by exploiting modern GPU hardware.
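The discriminator-based density ratio trick mentioned above can be sketched as follows (an illustrative example, not the thesis implementation): train a binary classifier to distinguish samples from p and q; at the optimum its logit equals log p(x)/q(x), which sidesteps the need for an explicit likelihood. Here p = N(0, 1) and q = N(1, 1), so the optimal logit 0.5 − x is exactly representable by logistic regression on [x, 1].

```python
import numpy as np

rng = np.random.default_rng(1)

# Density ratio estimation via a discriminator: label samples from p
# as 1 and samples from q as 0; the trained classifier's logit then
# estimates log p(x)/q(x). With p = N(0, 1) and q = N(1, 1) the true
# log ratio is 0.5 - x, i.e. optimal weights (-1.0, 0.5).
xp = rng.normal(0.0, 1.0, 20000)   # samples from p (label 1)
xq = rng.normal(1.0, 1.0, 20000)   # samples from q (label 0)
x = np.concatenate([xp, xq])
y = np.concatenate([np.ones_like(xp), np.zeros_like(xq)])
feats = np.stack([x, np.ones_like(x)], axis=1)

w = np.zeros(2)                    # logit(x) = w[0] * x + w[1]
for _ in range(2000):              # gradient descent on logistic loss
    d = 1.0 / (1.0 + np.exp(-feats @ w))
    w -= 0.1 * feats.T @ (d - y) / len(y)

def log_ratio(t):
    """Estimated log p(t)/q(t), read off the discriminator's logit."""
    return w[0] * t + w[1]

print(w)  # close to the optimal (-1.0, 0.5) up to sampling error
```

In a GAN-style setting the same identity d/(1 − d) = p/q lets a discriminator stand in for an intractable likelihood ratio; VEEGAN and RB-MMDnet build on variants of this estimator.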