
dc.contributor.advisor: O'Boyle, Michael
dc.contributor.advisor: Crowley, Elliot
dc.contributor.advisor: Komura, Taku
dc.contributor.author: Turner, Jack
dc.date.accessioned: 2022-08-23T09:48:21Z
dc.date.available: 2022-08-23T09:48:21Z
dc.date.issued: 2022-08-23
dc.identifier.uri: https://hdl.handle.net/1842/39326
dc.identifier.uri: http://dx.doi.org/10.7488/era/2577
dc.description.abstract: Improving the efficiency of neural networks has great potential impact due to their wide range of possible use cases and their high levels of arithmetic intensity. As neural network designs evolve and hardware grows more complex, the goal of modern deep learning compilers will be to exploit opportunities for optimisation at all levels of the deployment stack, from high-level choices about neural architectures all the way down to low-level decisions on code generation. This thesis decomposes neural network designs into three core components: skeletons, blocks, and operations. Each component is addressed individually, and the interactions between optimisations applied at different layers of the deployment stack are examined. First considered are optimisation schemes for neural network skeletons, and it is shown that the commonplace prune-and-finetune pattern has a negative impact on throughput on both CPUs and GPUs. New schemes are developed for downscaling skeletons that preserve hardware performance, yield better accuracy, and avoid the expensive finetuning stage. Secondly, this thesis considers optimisation for neural network blocks. A wealth of research has been dedicated to designing drop-in replacements for neural network blocks that attempt to improve their efficiency. Based on a set of simple drop-ins, this thesis develops a new method for quickly deciding which replacements to put where in a network. It is shown that the algorithm developed can be used more generally to design such blocks from scratch. A core facet of the algorithm is a rejection filter which can be used to guide the kinds of networks proposed. This rejection filter can take the form of simple parameter counts, or more complex compilation metrics such as optimised inference time or levels of data reuse. This provides a potential handle for interaction between the network designer and the optimising compiler (a minimal sketch of such a filter appears after this record). Finally, the thesis considers network operations. Ideas are unified from optimising compilers and network architecture search into a single framework that allows for the generation of new operations, and for the mutation of network architectures into highly optimised forms. [en]
dc.contributor.sponsor: Engineering and Physical Sciences Research Council (EPSRC) [en]
dc.language.iso: en [en]
dc.publisher: The University of Edinburgh [en]
dc.relation.hasversion: Turner, J., Cano, J., Radu, V., Crowley, E. J., O'Boyle, M., and Storkey, A. (2018). Characterising across stack optimisations for deep convolutional neural networks. In IEEE International Symposium on Workload Characterization. [en]
dc.relation.hasversion: Crowley, E. J., Turner, J., Storkey, A., and O'Boyle, M. (2018). A Closer Look at Structured Pruning for Neural Network Compression. In Advances in Neural Information Processing Systems Workshop on Compact Deep Neural Network Representation with Industrial Applications. [en]
dc.relation.hasversion: Turner, J., Crowley, E. J., O'Boyle, M., Storkey, A., and Gray, G. (2020). BlockSwap: Fisher-guided block substitution for network compression. In International Conference on Learning Representations. [en]
dc.relation.hasversion: Turner, J., Crowley, E. J., and O'Boyle, M. (2021). Neural Architecture Search as Program Transformation Exploration. In International Conference on Architectural Support for Programming Languages and Operating Systems. [en]
dc.subject: Skeleton-based optimisation [en]
dc.subject: Block-based optimisation [en]
dc.subject: Operation-based optimisation [en]
dc.subject: Convolutional networks [en]
dc.subject: Zero-cost architecture [en]
dc.title: Efficient neural networks [en]
dc.type: Thesis or Dissertation [en]
dc.type.qualificationlevel: Doctoral [en]
dc.type.qualificationname: PhD Doctor of Philosophy [en]
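
The abstract above describes a rejection filter that screens candidate block replacements, in its simplest form a parameter count. The following is a minimal, hypothetical Python/PyTorch sketch of that idea, not code from the thesis: the candidate blocks, the budget value, and all function names are illustrative assumptions.

```python
# Hypothetical sketch (not the thesis implementation): a parameter-count
# "rejection filter" screening drop-in replacements for a 3x3 conv block.
import torch.nn as nn

def param_count(module: nn.Module) -> int:
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

def make_candidates(in_ch: int, out_ch: int) -> dict[str, nn.Module]:
    """Illustrative drop-in replacements for a standard convolution."""
    return {
        "standard": nn.Conv2d(in_ch, out_ch, 3, padding=1),
        "depthwise_separable": nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),
            nn.Conv2d(in_ch, out_ch, 1),
        ),
        "bottleneck": nn.Sequential(
            nn.Conv2d(in_ch, in_ch // 4, 1),
            nn.Conv2d(in_ch // 4, in_ch // 4, 3, padding=1),
            nn.Conv2d(in_ch // 4, out_ch, 1),
        ),
    }

def rejected(block: nn.Module, budget: int) -> bool:
    """The rejection filter: here a simple parameter budget. Per the
    abstract, it could equally be a compilation metric such as optimised
    inference time or level of data reuse."""
    return param_count(block) > budget

if __name__ == "__main__":
    BUDGET = 10_000  # arbitrary illustrative budget
    for name, block in make_candidates(64, 64).items():
        verdict = "rejected" if rejected(block, BUDGET) else "kept"
        print(f"{name}: {param_count(block):,} params -> {verdict}")
```

Under this toy budget, the standard 64-to-64 convolution (roughly 37k parameters) is rejected while the cheaper substitutes pass. Swapping the parameter budget for a compiler-reported metric is what would give the network designer the handle into the optimising compiler that the abstract describes.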

