Edinburgh Research Archive


Advances in scalable learning and sampling of unnormalised models

View/Open
Rhodes2023.pdf (4.158 MB)
Date
10/07/2023
Author
Rhodes, Benjamin
Abstract
We study probabilistic models that are known only incompletely, up to an intractable normalising constant. To reap the full benefit of such models, two tasks must be solved: learning and sampling. These two tasks have been the subject of decades of research, and yet significant challenges still persist. Traditional approaches often suffer from poor scalability with respect to dimensionality and model complexity, generally rendering them inapplicable to models parameterised by deep neural networks. In this thesis, we contribute a new set of methods for addressing this scalability problem.

We first explore the problem of learning unnormalised models. Our investigation begins with a well-known learning principle, noise-contrastive estimation (NCE), whose underlying mechanism is that of density-ratio estimation. By examining why existing density-ratio estimators scale poorly, we identify a new framework, telescoping density-ratio estimation (TRE), that can learn ratios between highly dissimilar densities in high-dimensional spaces. Our experiments demonstrate that TRE not only yields substantial improvements for the learning of deep unnormalised models, but can do the same for a broader set of tasks, including mutual information estimation and representation learning.

Subsequently, we explore the problem of sampling from unnormalised models. A large literature on Markov chain Monte Carlo (MCMC) can be leveraged here, and in continuous domains, gradient-based samplers such as the Metropolis-adjusted Langevin algorithm (MALA) and Hamiltonian Monte Carlo are excellent options. However, there has been substantially less progress in MCMC for discrete domains. To advance this subfield, we introduce several discrete Metropolis-Hastings samplers that are conceptually inspired by MALA, and demonstrate their strong empirical performance across a range of challenging sampling tasks.
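Two of the techniques named in the abstract can be sketched compactly. First, the telescoping idea behind TRE: rather than estimating a single ratio between two highly dissimilar densities, the log-ratio is decomposed into a chain of easier ratios between gradually varying intermediate densities. In a common notation for this construction (the symbols below are illustrative, not taken verbatim from the thesis), with \(p_0\) the data density, \(p_m\) the noise density, and \(p_1, \dots, p_{m-1}\) intermediate "waymark" densities bridging them:

\[
\log \frac{p_0(x)}{p_m(x)} \;=\; \sum_{k=0}^{m-1} \log \frac{p_k(x)}{p_{k+1}(x)},
\]

so each summand compares two similar densities and can be estimated reliably by a separate classifier, even when the end-to-end ratio is extreme.

Second, MALA, the continuous gradient-based sampler the abstract mentions, illustrates why MCMC pairs naturally with unnormalised models: the Metropolis-Hastings acceptance ratio depends only on differences of log-densities, so the unknown normalising constant cancels. Below is a minimal NumPy sketch of one MALA step; the function name `mala_step` and the fixed step size are illustrative choices, not code from the thesis.

```python
import numpy as np

def mala_step(x, log_prob, grad_log_prob, step, rng):
    """One Metropolis-adjusted Langevin (MALA) step.

    `log_prob` may be unnormalised: the normalising constant
    cancels in the Metropolis-Hastings acceptance ratio.
    """
    eps2 = step ** 2
    # Langevin proposal: drift along the gradient, plus Gaussian noise.
    mean_fwd = x + 0.5 * eps2 * grad_log_prob(x)
    x_prop = mean_fwd + step * rng.standard_normal(x.shape)
    # Gaussian proposal log-densities (their shared constants cancel
    # between the forward and reverse directions, so they are omitted).
    mean_rev = x_prop + 0.5 * eps2 * grad_log_prob(x_prop)
    log_q_fwd = -np.sum((x_prop - mean_fwd) ** 2) / (2 * eps2)
    log_q_rev = -np.sum((x - mean_rev) ** 2) / (2 * eps2)
    # Metropolis-Hastings accept/reject.
    log_alpha = log_prob(x_prop) - log_prob(x) + log_q_rev - log_q_fwd
    if np.log(rng.uniform()) < log_alpha:
        return x_prop, True
    return x, False

# Usage: sample an unnormalised standard Gaussian target.
log_prob = lambda x: -0.5 * np.sum(x ** 2)   # normalising constant omitted
grad_log_prob = lambda x: -x
rng = np.random.default_rng(0)
x = np.zeros(2)
for _ in range(1000):
    x, accepted = mala_step(x, log_prob, grad_log_prob, step=0.5, rng=rng)
```

The thesis's discrete samplers adapt this gradient-informed proposal mechanism to discrete domains; the sketch above shows only the standard continuous algorithm that inspired them.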
URI
https://hdl.handle.net/1842/40761

http://dx.doi.org/10.7488/era/3518
Collections
  • Informatics thesis and dissertation collection
