Generalisation in deep reinforcement learning with multiple tasks and domains
dc.contributor.advisor
Hospedales, Timothy
en
dc.contributor.advisor
Storkey, Amos
en
dc.contributor.advisor
Vijayakumar, Sethu
en
dc.contributor.author
Zhao, Chenyang
en
dc.date.accessioned
2021-03-11T14:58:56Z
dc.date.available
2021-03-11T14:58:56Z
dc.date.issued
2020-11-30
dc.description.abstract
A long-standing vision of robotics research is to build autonomous systems that can
adapt to unforeseen environmental perturbations and learn a set of tasks progressively.
Reinforcement learning (RL) has shown great success in a variety of robot control
tasks thanks to recent advances in hardware and learning techniques. To further this
long-term goal, generalisation in RL has emerged as a demanding research topic, as
it allows learning agents to extract knowledge from past experience and transfer it to
new situations. This covers generalisation against sampling noise to avoid overfitting,
generalisation against environmental changes to avoid domain shift, and generalisation
over different but related tasks to achieve lifelong knowledge transfer. This thesis
investigates these challenges in the context of RL, with a primary focus on cross-domain
and cross-task generalisation.
We first address the problem of generalisation across domains. With a focus on
continuous control tasks, we characterise the sources of uncertainty that may cause
generalisation challenges in Deep RL, and provide a new benchmark and thorough
empirical evaluation of generalisation challenges for state-of-the-art Deep RL methods.
In particular, we show that, if generalisation is the goal, then the common practice of
evaluating algorithms based on their training performance leads to the wrong conclusions about algorithm choice. Moreover, we evaluate several techniques for improving
generalisation and draw conclusions about the most robust techniques to date.
The evaluation shows that learning from multiple domains improves generalisation
performance across domains. However, aggregating gradient information from different
domains may make learning unstable. In the second contribution, we propose updating
the policy at every iteration so as to minimise the sum of its distances to the policies
newly learned in each domain, measured by the Kullback-Leibler (KL) divergence
between output (action) distributions. We show that our method improves both the
asymptotic training reward and the robustness of the tested policy against domain
shifts in a variety of control tasks.
We finally investigate generalisation across different classes of control tasks. In
particular, we introduce a class of neural network controllers that can realise four
distinct tasks: reaching, object throwing, casting, and ball-in-cup. By factorising the
weights of the neural network, transferable latent skills are extracted, which accelerate
learning in cross-task transfer. With a suitable curriculum, this allows
us to learn challenging dexterous control tasks like ball-in-cup from scratch with only
reinforcement learning.
en
dc.identifier.uri
https://hdl.handle.net/1842/37519
dc.identifier.uri
http://dx.doi.org/10.7488/era/803
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Zhao, C., Hospedales, T. M., Stulp, F., and Sigaud, O. (2017). Tensor based knowledge transfer across skill categories for robot control. In IJCAI, pages 3462–3468.
en
dc.relation.hasversion
Zhao, C., Sigaud, O., Stulp, F., and Hospedales, T. M. (2019). Investigating generalisation in continuous deep reinforcement learning. arXiv preprint arXiv:1902.07015.
en
dc.subject
learning algorithms
en
dc.subject
learning generalisation
en
dc.subject
robotics research
en
dc.subject
reinforcement learning
en
dc.subject
generalisation of RL
en
dc.subject
cross-domain generalisation
en
dc.subject
cross-task generalisation
en
dc.title
Generalisation in deep reinforcement learning with multiple tasks and domains
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name:
- Zhao2020.pdf
- Size:
- 6.25 MB
- Format:
- Adobe Portable Document Format