
Edinburgh Research Archive

  •   ERA Home
  • Mathematics, School of
  • Mathematics thesis and dissertation collection
  • View Item

Iterative methods for solving stochastic optimal control problems

View/Open
Kerimkulov2022.pdf (1.234Mb)
Date
04/04/2022
Author
Kerimkulov, Bekzhan
Metadata
Show full item record
Abstract
Optimal control problems are inherently hard to solve, as the optimization must be performed simultaneously with updating the underlying system; most of the time they must therefore be solved numerically. In this thesis, we consider two iterative methods for solving stochastic optimal control problems: Howard's policy improvement algorithm and the method of successive approximations (MSA). Starting from an initial guess, Howard's policy improvement algorithm separates the step of updating the trajectory of the dynamical system from the optimization step, and iterating the two should converge to the optimal control. In the discrete space-time setting this is often the case, and rates of convergence are even known. In the continuous space-time setting of controlled diffusions, the algorithm consists of solving a linear PDE followed by a maximization problem. This has been shown to converge; in some situations, however, no global rate is known. The first main contribution is to establish a global rate of convergence for the policy improvement algorithm and for a variant, called here the gradient iteration algorithm. The second main contribution is a proof of stability of the algorithms under perturbations of both the accuracy of the linear PDE solution and the accuracy of the maximization step. The proof technique is new in this context, as it uses the theory of backward stochastic differential equations.
The classical MSA is an iterative method for solving stochastic control problems derived from Pontryagin's optimality principle. It is known that the MSA may fail to converge. Using estimates for the backward stochastic differential equation, we propose a modification of the MSA. This modified MSA is shown to converge for general stochastic control problems with control in both the drift and diffusion coefficients, and under some additional assumptions a rate of convergence is established. The results are valid without restrictions on the time horizon of the control problem, in contrast to iterative methods based on the theory of forward-backward stochastic differential equations. In addition, we study the MSA for solving stochastic control problems with entropy regularization, where the action space is a space of measures. We modify the classical MSA by adding the relative entropy of two consecutive controls produced by the algorithm. We establish convergence of the modified algorithm and show how it can be applied to relaxed stochastic control problems.
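The abstract notes that in the discrete space-time setting Howard's algorithm alternates policy evaluation with a pointwise maximization and is known to converge. A minimal sketch of that discrete analogue on a toy Markov decision process is below; the MDP, reward values, and discount factor are illustrative inventions, not taken from the thesis, which treats the continuous-time controlled-diffusion setting.

```python
# Minimal sketch of Howard's policy improvement (policy iteration) on a
# tiny discrete MDP. The thesis studies the continuous space-time analogue,
# where evaluation becomes a linear PDE solve; here it is a fixed-point
# iteration of the Bellman operator for the frozen policy.
# All model data (P, r, gamma) below are hypothetical.

def policy_iteration(P, r, gamma=0.9, tol=1e-10):
    """P[a][s][t]: transition probabilities, r[a][s]: rewards.
    Returns (policy, values) for the discounted infinite-horizon problem."""
    n_states, n_actions = len(r[0]), len(r)
    policy = [0] * n_states
    while True:
        # Step 1 (evaluation): solve for the value of the current policy.
        V = [0.0] * n_states
        while True:
            V_new = [r[policy[s]][s]
                     + gamma * sum(P[policy[s]][s][t] * V[t]
                                   for t in range(n_states))
                     for s in range(n_states)]
            done = max(abs(V_new[s] - V[s]) for s in range(n_states)) < tol
            V = V_new
            if done:
                break
        # Step 2 (improvement): maximize the Q-values state by state.
        new_policy = [max(range(n_actions),
                          key=lambda a: r[a][s]
                          + gamma * sum(P[a][s][t] * V[t]
                                        for t in range(n_states)))
                      for s in range(n_states)]
        if new_policy == policy:   # fixed point: policy is optimal
            return policy, V
        policy = new_policy
```

For example, with two states where action 1 leads to (and stays in) a state paying reward 1 per step, the iteration terminates at the policy choosing action 1 everywhere, with values 9 and 10 under discount 0.9.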
URI
https://hdl.handle.net/1842/38849

http://dx.doi.org/10.7488/era/2103
Collections
  • Mathematics thesis and dissertation collection

Library & University Collections HomeUniversity of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact
