Edinburgh Research Archive

Dynamic Generalisation of Continuous Action Spaces in Reinforcement Learning: A Neurally Inspired Approach

dc.contributor.advisor
Willshaw, David
en
dc.contributor.advisor
Hallam, John
en
dc.contributor.author
Smith, Andrew James
en
dc.contributor.sponsor
Engineering and Physical Sciences Research Council (EPSRC)
en
dc.date.accessioned
2004-11-24T16:44:17Z
dc.date.available
2004-11-24T16:44:17Z
dc.date.issued
2002-07
dc.description
Institute for Adaptive and Neural Computation
en
dc.description
Award number: 98318242.
en
dc.description.abstract
This thesis is about the dynamic generalisation of continuous action spaces in reinforcement learning problems. The standard Reinforcement Learning (RL) account provides a principled and comprehensive means of optimising a scalar reward signal in a Markov Decision Process. However, the theory itself does not directly address the imperative issue of generalisation which naturally arises as a consequence of large or continuous state and action spaces. A current thrust of research is aimed at fusing the generalisation capabilities of supervised (and unsupervised) learning techniques with the RL theory. An example par excellence is Tesauro’s TD-Gammon. Although much effort has gone into researching ways to represent and generalise over the input space, much less attention has been paid to the action space. This thesis first considers the motivation for learning real-valued actions, and then proposes a set of key properties desirable in any candidate algorithm addressing generalisation of both input and action spaces. These properties include: Provision of adaptive and online generalisation, adherence to the standard theory with a central focus on estimating expected reward, provision for real-valued states and actions, and full support for a real-valued discounted reward signal. Of particular interest are issues pertaining to robustness in non-stationary environments, scalability, and efficiency for real-time learning in applications such as robotics. Since exploring the action space is discovered to be a potentially costly process, the system should also be flexible enough to enable maximum reuse of learned actions. A new approach is proposed which succeeds for the first time in addressing all of the key issues identified. The algorithm, which is based on the ubiquitous self-organising map, is analysed and compared with other techniques including those based on the backpropagation algorithm. The investigation uncovers some important implications of the differences between these two particular approaches with respect to RL. In particular, the distributed representation of the multi-layer perceptron is judged to be something of a double-edged sword offering more sophisticated and more scalable generalising power, but potentially causing problems in dynamic or non-equiprobable environments, and tasks involving a highly varying input-output mapping. The thesis concludes that the self-organising map can be used in conjunction with current RL theory to provide real-time dynamic representation and generalisation of continuous action spaces. The proposed model is shown to be reliable in non-stationary, unpredictable and noisy environments and judged to be unique in addressing and satisfying a number of desirable properties identified as important to a large class of RL problems.
en
dc.format.extent
23504610 bytes
en
dc.format.extent
9838903 bytes
en
dc.format.mimetype
application/postscript
en
dc.format.mimetype
application/pdf
en
dc.identifier.uri
http://hdl.handle.net/1842/634
dc.language.iso
en
dc.publisher
University of Edinburgh. College of Science and Engineering. School of Informatics.
en
dc.subject.other
Reinforcement Learning
en
dc.subject.other
Continuous Action Spaces
en
dc.subject.other
Markov Decision Process
en
dc.title
Dynamic Generalisation of Continuous Action Spaces in Reinforcement Learning: A Neurally Inspired Approach
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Now showing 1 - 2 of 2
Name:
2001-andys.ps
Size:
22.42 MB
Format:
Postscript Files
Description:
PostScript File
Name:
2001_andys.pdf
Size:
9.38 MB
Format:
Adobe Portable Document Format
Description:
PDF Format

This item appears in the following Collection(s)