Generalisation in deep reinforcement learning with multiple tasks and domains
dc.contributor.advisor
Hospedales, Timothy
en
dc.contributor.advisor
Storkey, Amos
en
dc.contributor.advisor
Vijayakumar, Sethu
en
dc.contributor.author
Zhao, Chenyang
en
dc.date.accessioned
2021-03-11T14:58:56Z
dc.date.available
2021-03-11T14:58:56Z
dc.date.issued
2020-11-30
dc.description.abstract
A long-standing vision of robotics research is to build autonomous systems that can
adapt to unforeseen environmental perturbations and learn a set of tasks progressively.
Reinforcement learning (RL) has shown great success in a variety of robot control
tasks thanks to recent advances in hardware and learning techniques. To further this
long-term goal, generalisation in RL has emerged as a demanding research topic, as
it allows learning agents to extract knowledge from past experience and transfer it to
new situations. This covers generalisation against sampling noise to avoid overfitting,
generalisation against environmental changes to avoid domain shift, and generalisation
over different but related tasks to achieve lifelong knowledge transfer. This thesis
investigates these challenges in the context of RL, with a primary focus on cross-domain
and cross-task generalisation.
We first address the problem of generalisation across domains. With a focus on
continuous control tasks, we characterise the sources of uncertainty that may cause
generalisation challenges in Deep RL, and provide a new benchmark and thorough
empirical evaluation of generalisation challenges for state-of-the-art Deep RL methods.
In particular, we show that, if generalisation is the goal, then the common practice of
evaluating algorithms based on their training performance leads to the wrong conclusions about algorithm choice. Moreover, we evaluate several techniques for improving
generalisation and draw conclusions about the most robust techniques to date.
The evaluation shows that learning from multiple domains improves generalisation
performance across domains. However, aggregating gradient information from different
domains may make learning unstable. In the second contribution, we propose updating
the policy at every iteration so as to minimise the sum of its distances to the policies
newly learned in each domain, measured by the Kullback-Leibler (KL) divergence
between output (action) distributions. We show that our method improves both the
asymptotic training reward and the robustness of the tested policy against domain
shifts in a variety of control tasks.
We finally investigate generalisation across different classes of control tasks. In
particular, we introduce a class of neural network controllers that can realise four
distinct tasks: reaching, object throwing, casting, and ball-in-cup. By factorising the
weights of the neural network, transferable latent skills are extracted, which accelerate
learning in cross-task transfer. With a suitable curriculum, this allows
us to learn challenging dexterous control tasks like ball-in-cup from scratch with only
reinforcement learning.
en
dc.identifier.uri
https://hdl.handle.net/1842/37519
dc.identifier.uri
http://dx.doi.org/10.7488/era/803
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Zhao, C., Hospedales, T. M., Stulp, F., and Sigaud, O. (2017). Tensor based knowledge transfer across skill categories for robot control. In IJCAI, pages 3462–3468.
en
dc.relation.hasversion
Zhao, C., Sigaud, O., Stulp, F., and Hospedales, T. M. (2019). Investigating generalisation in continuous deep reinforcement learning. arXiv preprint arXiv:1902.07015.
en
dc.subject
learning algorithms
en
dc.subject
learning generalisation
en
dc.subject
robotics research
en
dc.subject
reinforcement learning
en
dc.subject
generalisation of RL
en
dc.subject
cross-domain generalisation
en
dc.subject
cross-task generalisation
en
dc.title
Generalisation in deep reinforcement learning with multiple tasks and domains
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name:
- Zhao2020.pdf
- Size:
- 6.25 MB
- Format:
- Adobe Portable Document Format