Probabilistic inference in Bayesian neural networks

Sheinkman, Alisa

Files

Sheinkman2025.pdf (24.8 MB)

Date

2025-07-29

Authors

Sheinkman, Alisa

Abstract

Despite their widespread applicability and dominant role in machine learning, neural networks remain highly non-transparent and are often regarded as black boxes because they lack human-understandable interpretations. Conventional deep models tend to be overconfident in their predictions, provide poor uncertainty estimates and are sensitive to adversarial attacks. The Bayesian paradigm provides a natural framework for addressing these challenges by considering infinite ensembles of differently weighted neural networks. Bayesian neural networks are capable of capturing uncertainty, improving accuracy and controlling model complexity. Unfortunately, for most real-world problems exact probabilistic inference is unavailable, and asymptotically exact Markov chain Monte Carlo becomes daunting when dealing with large, high-dimensional datasets and the multimodal posteriors of neural networks. At the same time, the faster and computationally appealing optimization-centric variational inference lacks the theoretical justification of sampling-based methods and is known to underestimate the uncertainty of the true posterior distribution. This thesis systematically studies different aspects of variational inference, namely its theoretical foundations, its challenges and the means of dealing with them. Further, it addresses the practical questions that arise when implementing and comparing Bayesian neural networks, and analyses how predictive performance depends on architectural choices and on the alignment between the model and the inference algorithm. Finally, this thesis contributes to the development of variational inference techniques and presents a novel kind of Bayesian neural network, called a variational bow tie neural network, in which we employ sparsity-promoting priors and consider an improved version of the classical coordinate ascent variational inference algorithm.
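The abstract's remark that variational inference tends to underestimate posterior uncertainty can be illustrated on a toy problem. The sketch below is not taken from the thesis and does not reproduce its improved algorithm or the bow tie network. It applies textbook coordinate ascent variational inference with a fully factorised Gaussian approximation to a correlated bivariate Gaussian target. The function name `cavi_gaussian`, the target parameters and the iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def cavi_gaussian(mu, precision, n_iters=200):
    """Textbook CAVI for a Gaussian target N(mu, precision^{-1}) with a
    fully factorised approximation q(z) = prod_i q_i(z_i).
    Hypothetical helper for illustration, not the thesis's algorithm."""
    d = len(mu)
    m = np.zeros(d)  # variational means, arbitrary starting point
    for _ in range(n_iters):
        for i in range(d):
            others = [j for j in range(d) if j != i]
            # The optimal factor q_i is Gaussian; its mean depends on the
            # current means of the other factors.
            m[i] = mu[i] - precision[i, others] @ (m[others] - mu[others]) / precision[i, i]
    # The optimal factor variances are fixed at 1 / precision[i, i].
    variances = 1.0 / np.diag(precision)
    return m, variances

# Assumed toy target: unit marginal variances, correlation 0.9.
mu = np.array([1.0, -1.0])
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
precision = np.linalg.inv(cov)

q_mean, q_var = cavi_gaussian(mu, precision)
true_var = np.diag(cov)
print("variational means:    ", q_mean)
print("variational variances:", q_var)
print("true marginal variances:", true_var)
```

In this toy setting the factorised approximation recovers the target mean, but its marginal variances are 0.19 against true marginal variances of 1, because the factorised family cannot represent the correlation between the two coordinates.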

URI

https://hdl.handle.net/1842/43738
http://dx.doi.org/10.7488/era/6271

This item appears in the following Collection(s)

Mathematics thesis and dissertation collection