Edinburgh Research Archive

Efficient methods and architectures for deep neural network sequence models

dc.contributor.advisor
Ramamoorthy, Subramanian
dc.contributor.advisor
Li, Zhibin
dc.contributor.author
Mbabazi, Emmanuel Kahembwe
dc.contributor.sponsor
Engineering and Physical Sciences Research Council (EPSRC)
dc.date.accessioned
2022-01-20T11:39:55Z
dc.date.available
2022-01-20T11:39:55Z
dc.date.issued
2021-11-30
dc.description.abstract
The recent resurgence of neural networks, termed "Deep Learning", has led to a reinvigoration of the artificial intelligence research field and all related sub-fields, from robotics and vision to natural language processing and understanding. In the last decade, this field has seen incredible breakthroughs, primarily driven by improvements to computing capability that have allowed for ever larger neural network architectures. The key driving force behind this resurgence has been the graphics processing unit (GPU), and as deep neural networks (DNNs) get ever larger, efficiency has become a bottleneck. Even with ample GPUs and significant financial resources, the state-of-the-art neural network models and methods are out of reach for most scientists. The significance of this challenge is brought to bear when attempting to use DNNs on video, the most consumed form of data and media. Modelling high dimensional data such as video is already computationally expensive and challenging even with small neural networks. With the 2020 Coronavirus pandemic, production and consumption of video has greatly increased as the global business population moves to working and interacting online. The low cost of video production and transmission is quickly making it the most common medium of digital communication for socially distanced humans. Video is also often the cheapest and most detailed source of information relied upon in fields such as robotics, for driverless cars, drones and teleoperated machines. As such, being able to efficiently model such data is of paramount importance to the field of AI. In this thesis, we tackle the issue of efficient modelling of complex high dimensional sequential data such as video and language. We address this problem on two fronts: computational efficiency and algorithmic efficiency. On the computational front, we propose a design methodology that significantly lowers the cost of video modelling tasks while improving performance.
To enable this, we bring to bear the tools of Hessian analysis in the most comprehensive analysis of generative video models to date. We then go on to tackle sequential modelling from an algorithmic efficiency perspective. We propose methods that use the temporal dynamics of sequential data to improve modelling performance post-training. We highlight the new capabilities enabled when optimization is not restricted to training scenarios and conjecture that intelligent systems should never stop training. In a collaborative effort, we propose similar approaches for natural language modelling. To conclude, we demonstrate, with a single commodity GPU, that our proposed methods and architectures realise state-of-the-art results, often surpassing the performance of models trained on hundreds of GPUs at significant financial cost.
dc.identifier.uri
https://hdl.handle.net/1842/38446
dc.identifier.uri
http://dx.doi.org/10.7488/era/1710
dc.language.iso
en
dc.publisher
The University of Edinburgh
dc.subject
deep learning
dc.subject
video analysis
dc.subject
sequence modeling
dc.subject
natural language processing
dc.subject
neural networks
dc.title
Efficient methods and architectures for deep neural network sequence models
dc.type
Thesis or Dissertation
dc.type.qualificationlevel
Doctoral
dc.type.qualificationname
PhD Doctor of Philosophy

Files

Original bundle

Name:
Mbabazi2021.pdf
Size:
2.65 MB
Format:
Adobe Portable Document Format
