Edinburgh Research Archive


Stochastic dynamics and partitioned algorithms for model parameterization in deep learning

View/Open
VlaarTJ_2022.pdf (15.71Mb)
Date
16/06/2022
Item status
Restricted Access
Embargo end date
16/06/2023
Author
Vlaar, Tiffany Joyce
Abstract
In this thesis, we study model parameterization for deep learning applications. Part of the mathematical foundation for this work lies in stochastic differential equations and their constrained counterparts. We study their role in deep learning, their properties, and their discretization. On the deep learning theory side, we discuss questions around generalization error, optimization, the structure of neural network loss landscapes, and existing metrics of neural network training. Rather than aiming to exceed state-of-the-art results on benchmark datasets, our work in this area aims to study and tease out underlying properties of neural network optimization, and to use those findings to obtain enhanced generalization performance. Our optimization schemes often draw inspiration from molecular dynamics and statistical physics, and pave the way towards training robust and generalizable neural networks on datasets that arise in the physical sciences.

The contributions of this thesis are as follows:
  1. We illustrate that embedding the loss gradient in a second-order Langevin dynamics framework and using low temperatures leads to more exploration and increased robustness, and, in combination with partitioned integrators, can lead to enhanced generalization performance of neural networks on certain classification tasks.
  2. We provide a general framework for using constrained stochastic differential equations to train deep neural networks. Constraints provide direct control of the parameter space, which allows us to study their effect on generalization directly. We provide a statistical guarantee on the convergence of the training, along with detailed implementation schemes for specific constraints (magnitude-based constraints and orthogonality of the weight matrix) and extensive testing.
  3. We illustrate the presence of latent multiple time scales in deep learning applications and introduce the use of multirate techniques for neural network training. We analyze the convergence properties of our multirate scheme and compare it with vanilla stochastic gradient descent. As a main application, we show that a multirate approach can train deep neural networks for transfer learning applications in half the time, without losing generalization performance.
  4. We re-evaluate existing deep learning metrics. In particular, we study the use of the loss along the linear path between the initial and final parameters of a network as a measure of the loss landscape. We show that caution is needed when using linear interpolation to make broader claims about the shape of the landscape and the success of optimization. We also find that certain neural network layers are more sensitive to the choice of initialization and optimizer hyperparameter settings, and use these observations to design custom optimization schemes.
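Contribution 1 embeds the loss gradient in a second-order (underdamped) Langevin dynamics. As a rough illustration only, not the thesis's actual integrator, a minimal gradient-kick plus Ornstein-Uhlenbeck momentum update on a toy quadratic loss might look like the following; the step size `h`, friction `gamma`, and temperature `tau` are placeholder values:

```python
import numpy as np

rng = np.random.default_rng(0)

def langevin_step(theta, p, grad, h=0.01, gamma=1.0, tau=1e-4):
    """One sketch step of underdamped Langevin dynamics:
    a gradient kick on the momentum p, an exact Ornstein-Uhlenbeck
    damping/noise update at temperature tau, then a parameter drift."""
    p = p - h * grad(theta)                      # kick: loss-gradient force
    c = np.exp(-gamma * h)                       # OU damping factor
    p = c * p + np.sqrt(tau * (1 - c**2)) * rng.standard_normal(p.shape)
    theta = theta + h * p                        # drift: move parameters
    return theta, p

# Toy quadratic "loss" L(theta) = |theta|^2 standing in for a network loss.
grad = lambda th: 2.0 * th
theta, p = np.ones(5), np.zeros(5)
for _ in range(2000):
    theta, p = langevin_step(theta, p, grad)
print(np.abs(theta).max())  # parameters settle near the minimum, with tau-sized noise
```

At low temperature the iterates concentrate near the minimizer; a partitioned scheme in the thesis's sense would additionally treat different parameter groups with different dynamics.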
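Contribution 3 exploits multiple latent time scales by updating different parameter groups at different rates. A hedged sketch of the general idea, not the thesis's scheme: in a transfer-learning-like setup, hypothetical "fast" head weights are updated every iteration while "slow" backbone weights are updated only every `k` steps with a correspondingly larger step:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: "slow" backbone weights w_s and "fast" head weights w_f.
w_s, w_f = rng.standard_normal(4), rng.standard_normal(4)

def grads(w_s, w_f):
    # Gradients of a toy quadratic loss 0.5 * (|w_s|^2 + |w_f|^2).
    return w_s, w_f

k = 5        # slow parameters are updated once every k fast steps
h = 0.05     # fast step size
for step in range(500):
    g_s, g_f = grads(w_s, w_f)
    w_f = w_f - h * g_f              # fast update every iteration
    if step % k == 0:
        w_s = w_s - k * h * g_s      # slow update, larger effective step
print(np.abs(w_f).max(), np.abs(w_s).max())  # both groups converge
```

The slow group incurs only 1/k of the gradient work, which is the source of the wall-clock savings the abstract reports for transfer learning.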
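Contribution 4 examines the loss along the straight line between initial and final parameters. A minimal sketch of that diagnostic on a toy non-convex loss (all function and variable names here are illustrative):

```python
import numpy as np

def interp_losses(loss, theta0, theta1, n=11):
    """Loss along theta(a) = (1 - a) * theta0 + a * theta1 for a in [0, 1].
    Per the abstract's caution: a benign 1-D profile along this line does
    not by itself justify broader claims about the full landscape."""
    alphas = np.linspace(0.0, 1.0, n)
    return alphas, np.array([loss((1 - a) * theta0 + a * theta1) for a in alphas])

# Toy non-convex loss: quadratic bowl plus an oscillatory term.
loss = lambda th: float(np.sum(th**2) + np.sin(3 * th).sum())
theta0 = np.full(3, 2.0)   # stand-in for the initial parameters
theta1 = np.zeros(3)       # stand-in for the trained parameters
alphas, losses = interp_losses(loss, theta0, theta1)
print(losses.round(2))     # 1-D slice of the landscape along the line
```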
URI
https://hdl.handle.net/1842/39125

http://dx.doi.org/10.7488/era/2376
Collections
  • Mathematics thesis and dissertation collection

Library & University Collections Home | University of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact