On probabilistic inference approaches to stochastic optimal control

Rawlik, Konrad Cyrus

dc.contributor.advisor

Van Rossum, Mark

en

dc.contributor.advisor

Vijayakumar, Sethu

en

dc.contributor.author

Rawlik, Konrad Cyrus

en

dc.contributor.sponsor

Engineering and Physical Sciences Research Council (EPSRC)

en

dc.date.accessioned

2014-01-09T14:08:17Z

dc.date.available

2014-01-09T14:08:17Z

dc.date.issued

2013-11-28

dc.description.abstract

While stochastic optimal control, together with associated formulations like Reinforcement Learning, provides a formal approach to, amongst others, motor control, it remains computationally challenging for most practical problems. This thesis is concerned with the study of relations between stochastic optimal control and probabilistic inference. Such dualities, exemplified by the classical Kalman Duality between the Linear-Quadratic-Gaussian control problem and the filtering problem in Linear-Gaussian dynamical systems, make it possible to exploit advances made within the separate fields. In this context, the emphasis in this work lies with the utilisation of approximate inference methods for the control problem. Rather than concentrating on special cases which yield analytical inference problems, we propose a novel interpretation of stochastic optimal control in the general case in terms of minimisation of certain Kullback-Leibler divergences. Although these minimisations remain analytically intractable, we show that natural relaxations of the exact dual lead to new practical approaches. We introduce two particular general iterative methods: ψ-Learning, which has global convergence guarantees and provides a unifying perspective on several previously proposed algorithms, and Posterior Policy Iteration, which allows direct application of inference methods. From these, practical algorithms for Reinforcement Learning, based on a Monte Carlo approximation to ψ-Learning, and model-based stochastic optimal control, using a variational approximation of Posterior Policy Iteration, are derived. In order to overcome the inherent limitations of parametric variational approximations, we furthermore introduce a new approach for non-parametric approximate stochastic optimal control based on a reproducing kernel Hilbert space embedding of the control problem.
Finally, we address the general problem of temporal optimisation, i.e., joint optimisation of controls and temporal aspects, e.g., duration, of the task. Specifically, we introduce a formulation of temporal optimisation based on a generalised form of the finite horizon problem. Importantly, we show that the generalised problem has a dual finite horizon problem of the standard form, thus bringing temporal optimisation within the reach of most commonly used algorithms. Throughout, problems from the area of motor control of robotic systems are used to evaluate the proposed methods and demonstrate their practical utility.

en

dc.identifier.uri

http://hdl.handle.net/1842/8293

dc.language.iso

en

dc.publisher

The University of Edinburgh

en

dc.relation.hasversion

Rawlik, K. and Toussaint, M. and Vijayakumar, S. (2012). On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference. In Proc. Robotics: Science and Systems VIII.

en

dc.relation.hasversion

Nakanishi, J. and Rawlik, K. and Vijayakumar, S. (2011). Stiffness and Temporal Optimization in Periodic Movements: An Optimal Control Approach. In Proc. Int. Conf. on Intelligent Robots and Systems.

en

dc.relation.hasversion

Rawlik, K. and Toussaint, M. and Vijayakumar, S. (2010). An Approximate Inference Approach to Temporal Optimization in Optimal Control. In Proc. Advances in Neural Information Processing Systems.

en

dc.relation.hasversion

Rawlik, K. and Toussaint, M. and Vijayakumar, S. Approximate Inference Formulations of Stochastic Optimal Control and Reinforcement Learning. Submitted to Autonomous Robots.

en

dc.subject

stochastic optimal control

en

dc.subject

probabilistic inference

en

dc.subject

Linear-Quadratic-Gaussian control problem

en

dc.subject

ψ-Learning

en

dc.subject

temporal optimisation

en

dc.title

On probabilistic inference approaches to stochastic optimal control

en

dc.type

Thesis or Dissertation

en

dc.type.qualificationlevel

Doctoral

en

dc.type.qualificationname

PhD Doctor of Philosophy

en

Files

Original bundle

Name: sources.zip
Size: 3.34 MB
Format: Unknown data format

Name: Rawlik2013.pdf
Size: 2.66 MB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Informatics thesis and dissertation collection