Iterative methods for solving stochastic optimal control problems
dc.contributor.advisor
Siska, David
dc.contributor.advisor
Ottobre, Michela
dc.contributor.author
Kerimkulov, Bekzhan
dc.date.accessioned
2022-04-11T09:39:29Z
dc.date.available
2022-04-11T09:39:29Z
dc.date.issued
2022-04-04
dc.description.abstract
Optimal control problems are inherently hard to solve as the optimization must be
performed simultaneously with updating the underlying system. Therefore, most
of the time, they have to be solved numerically. In this thesis, we consider two
iterative methods for solving stochastic optimal control problems: Howard’s policy improvement algorithm and the method of successive approximations (MSA).
Starting from an initial guess, Howard’s policy improvement algorithm separates the step of updating the trajectory of the dynamical system from the optimization step, and iterating the two should converge to the optimal control. In the discrete space-time setting this is often the case, and even rates of convergence are known. In the continuous space-time setting of controlled diffusions, each iteration of the algorithm consists of solving a linear PDE followed by a maximization problem. The algorithm has been shown to converge; in some situations, however, no global rate is known. The first main contribution is to establish a global rate of convergence for the policy improvement algorithm and for a variant, called here the gradient iteration algorithm.
The second main contribution is the proof of stability of the algorithms under
perturbations to both the accuracy of the linear PDE solution and the accuracy
of the maximization step. The proof technique is new in this context as it uses
the theory of backward stochastic differential equations.
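In the discrete space-time setting mentioned above, the two alternating steps of the algorithm (linear policy evaluation, then pointwise maximization) can be sketched on a finite Markov decision process. The MDP below is a randomly generated toy example, and all names and sizes are illustrative, not taken from the thesis:

```python
import numpy as np

# Hypothetical small finite MDP used only to illustrate the scheme:
# P[a] is the transition matrix under action a, r[a] the reward vector.
n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a][s] is a distribution
r = rng.standard_normal((n_actions, n_states))

policy = np.zeros(n_states, dtype=int)  # initial guess
for _ in range(100):
    # Step 1: policy evaluation -- solve a linear system for the value of the
    # current policy (the discrete analogue of solving the linear PDE).
    P_pi = P[policy, np.arange(n_states)]
    r_pi = r[policy, np.arange(n_states)]
    v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
    # Step 2: policy improvement -- pointwise maximization over actions.
    q = r + gamma * P @ v  # q[a, s]
    new_policy = q.argmax(axis=0)
    if np.array_equal(new_policy, policy):
        break  # the greedy policy is stable, hence optimal
    policy = new_policy
```

On a finite MDP this loop terminates in finitely many steps, since each iteration produces a policy at least as good as the last and there are finitely many policies.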
The classical MSA is an iterative method for solving stochastic control problems and is derived from Pontryagin’s optimality principle. It is known that the
MSA may fail to converge. Using estimates for the backward stochastic differential equation, we propose a modification of the MSA algorithm. This modified MSA is shown to converge for general stochastic control problems with control in both the drift and diffusion coefficients. Under some additional assumptions, a rate of convergence is established. The results are valid without restrictions on
the time horizon of the control problem, in contrast to iterative methods based
on the theory of forward-backward stochastic differential equations. In addition,
we study the MSA for solving stochastic control problems with entropy regularization, where the action space is the space of measures. We modify the classical MSA by penalizing with the relative entropy of two consecutive controls produced by the algorithm. We establish convergence of the algorithm and show how it can be applied to relaxed stochastic control problems.
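The MSA iteration derived from Pontryagin's principle (forward state pass, backward adjoint pass, Hamiltonian maximization) can be illustrated on a toy problem. The sketch below is deterministic and discrete-time, whereas the thesis treats controlled SDEs via BSDE estimates; the quadratic penalty is only a simple stand-in for the modification (the relative-entropy penalty) proposed there, and all constants are illustrative:

```python
import numpy as np

# Toy problem: maximise  sum_t -(x_t^2 + a_t^2) h  -  x_T^2
# subject to x_{t+1} = x_t + a_t h, starting from x_0.
T, h, x0, rho = 20, 0.05, 1.0, 1.0
actions = np.linspace(-2.0, 2.0, 81)  # discretised action space

a = np.zeros(T)  # initial guess for the control
for _ in range(50):
    # Forward pass: state trajectory under the current control.
    x = np.empty(T + 1); x[0] = x0
    for t in range(T):
        x[t + 1] = x[t] + a[t] * h
    # Backward pass: adjoint (costate) from Pontryagin's principle.
    p = np.empty(T + 1); p[T] = -2.0 * x[T]
    for t in range(T - 1, -1, -1):
        p[t] = p[t + 1] - 2.0 * x[t] * h  # dH/dx = -2x (drift has no x-dependence)
    # Update: maximise the Hamiltonian, penalised by the squared distance to
    # the previous control -- a stand-in for the relative-entropy penalty.
    new_a = np.empty(T)
    for t in range(T):
        H = p[t + 1] * actions - (x[t] ** 2 + actions ** 2) \
            - rho * (actions - a[t]) ** 2
        new_a[t] = actions[H.argmax()]
    if np.max(np.abs(new_a - a)) < 1e-12:
        break  # iteration has reached a fixed point on the grid
    a = new_a
```

The penalty term damps the control update between iterations, which is what prevents the divergence the classical (unpenalized) MSA can exhibit.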
en
dc.identifier.uri
https://hdl.handle.net/1842/38849
dc.identifier.uri
http://dx.doi.org/10.7488/era/2103
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
B. Kerimkulov, D. Šiška, and L. Szpruch, “Exponential convergence and stability of Howard’s policy improvement algorithm for controlled diffusions,” SIAM Journal on Control and Optimization, vol. 58, no. 3, pp. 1314–1340, 2020
en
dc.relation.hasversion
B. Kerimkulov, D. Šiška, and L. Szpruch, “A modified MSA for stochastic control problems,” Applied Mathematics and Optimization, vol. 84, pp. 3417–3436, 2021
en
dc.subject
stochastic control problems
en
dc.subject
Howard’s policy improvement algorithm
en
dc.subject
method of successive approximations
en
dc.subject
Pontryagin’s optimality principle
en
dc.subject
backward stochastic differential equation
en
dc.subject
MSA algorithm
en
dc.title
Iterative methods for solving stochastic optimal control problems
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name: Kerimkulov2022.pdf
- Size: 1.23 MB
- Format: Adobe Portable Document Format