Show simple item record

dc.contributor.advisor  Storkey, Amos
dc.contributor.advisor  Murray, Iain
dc.contributor.author  Edwards, Harrison
dc.date.accessioned  2022-08-16T13:51:15Z
dc.date.available  2022-08-16T13:51:15Z
dc.date.issued  2022-08-16
dc.identifier.uri  https://hdl.handle.net/1842/39312
dc.identifier.uri  http://dx.doi.org/10.7488/era/2563
dc.description.abstract  With the rise of the internet, data of many varieties, including images, audio, text, and video, are abundant. Unfortunately, for the specific task one has in mind, data are typically not abundant: one might have only a small amount of labelled data, only noisy labels, labels for a different task, a simulator and a reward function but no demonstrations, or even a simulator but no reward function at all. However, arguably no task is truly novel, so it is often possible for neural networks to benefit from the abundant data related to the task at hand. This thesis documents three methods for learning from alternative sources of supervision, as opposed to the preferable but rare case of having unlimited direct examples of the task. First, we show how data from many related tasks can be described with a simple graphical model and fit using a variational autoencoder, directly modelling and representing the relations amongst tasks. Second, we investigate various forms of prediction-based intrinsic reward for agents in a simulator with no extrinsic rewards. Third, we introduce a novel intrinsic reward and investigate how best to combine it with an extrinsic reward.  en
dc.contributor.sponsor  Engineering and Physical Sciences Research Council (EPSRC)  en
dc.language.iso  en  en
dc.publisher  The University of Edinburgh  en
dc.relation.hasversion  Achiam, J., Edwards, H., Amodei, D., and Abbeel, P. (2018). Variational option discovery algorithms. arXiv preprint arXiv:1807.10299  en
dc.relation.hasversion  Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., and Efros, A. A. (2018). Large-scale study of curiosity-driven learning. In International Conference on Learning Representations  en
dc.relation.hasversion  Burda, Y., Edwards, H., Storkey, A., and Klimov, O. (2018). Exploration by random network distillation. In International Conference on Learning Representations  en
dc.relation.hasversion  Edwards, H. and Storkey, A. (2016). Towards a neural statistician. International Conference on Learning Representations  en
dc.subject  graphical model data  en
dc.subject  Variational-Autoencoder  en
dc.subject  prediction-based intrinsic rewards  en
dc.subject  intrinsic reward  en
dc.subject  extrinsic reward  en
dc.subject  modelling  en
dc.subject  metalearning  en
dc.title  Learning from alternative sources of supervision  en
dc.type  Thesis or Dissertation  en
dc.type.qualificationlevel  Doctoral  en
dc.type.qualificationname  PhD Doctor of Philosophy  en

