Edinburgh Research Archive logo

Edinburgh Research Archive

University of Edinburgh homecrest
View Item 
  •   ERA Home
  • Informatics, School of
  • Informatics thesis and dissertation collection
  • View Item
  •   ERA Home
  • Informatics, School of
  • Informatics thesis and dissertation collection
  • View Item
  • Login
JavaScript is disabled for your browser. Some features of this site may not work without it.

Strategy complexity of lim sup and lim inf payoff objectives in countable MDPs

View/Open
MundayE_2023.pdf (954.2Kb)
Date
30/08/2023
Author
Munday, Eric
Metadata
Show full item record
Abstract
We study countably infinite Markov decision processes (MDPs) with real-valued tran- sition rewards. A strategy is a function which decides how plays proceed within the MDP. Every strategy induces a set of infinite runs in the MDP and each infinite run induces the following sequences of payoffs: 1. Point payoff (the sequence of directly seen transition rewards), 2. Mean payoff (the sequence of the sums of all rewards so far, divided by the number of steps), and 3. Total payoff (the sequence of the sums of all rewards so far). For each payoff type, the threshold objective is to maximise the probability that the lim sup/lim inf is non-negative. We are interested in the strategy complexity of the above objectives, i.e. the amount of memory and/or randomisation that a strategy needs access to in order to play well (optimally resp. ε-optimally). Our results seek not only to decide whether an objective requires finite or infinite memory, but in the case of infinite memory, what kind of infinite memory is necessary and sufficient. For example, a step counter which acts as a clock, or a reward counter which sums up the seen rewards may be sufficient. We compare the lim sup/lim inf point payoff objectives to the Büchi/co-Büchi ob- jectives which, given a set of states or transitions, seek to maximise the probability that this set is visited infinitely/finitely often. Convergence effects are what differen- tiate lim sup/lim inf point payoff objectives from Büchi/co-Büchi. For example, the sequence −1/2, −1/3, −1/4 . . . does satisfy lim sup ≥ 0 and lim inf ≥ 0 despite all of the rewards being negative. It is in dealing with these effects which we make our main technical contributions. We establish a complete picture of the strategy complexity for both the lim sup and lim inf point payoff objectives. In particular we show that opti- mal lim sup requires either randomisation or access to a step counter and that lim inf of point payoff requires a step counter (but not more) when the underlying MDP is infinitely branching. We also comprehensively pin down the strategy complexity for the lim inf total and mean payoff objectives. This result requires a novel counterexample involving unboundedly growing rewards as well as finely tuned transition probabilities which force the player to use memory in order to mimic what occurred in past random events. This allows us to show that both of these objectives require the use of both a step counter as well as a reward counter. We apply our results to solve two open problems from Sudderth [35] about the strategy complexity of optimal strategies for the expected lim sup/lim inf point pay- off. We achieve this by reducing each objective to its respective optimal threshold lim sup/lim inf point payoff counterpart. Thus we are able to conclude that they share the same optimal strategy complexity.
URI
https://hdl.handle.net/1842/40898

http://dx.doi.org/10.7488/era/3651
Collections
  • Informatics thesis and dissertation collection

Library & University Collections HomeUniversity of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact
Privacy & Cookies
Takedown Policy
Accessibility
Contact
feed RSS Feeds

RSS Feed not available for this page

 

 

All of ERACommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsPublication TypeSponsorSupervisorsThis CollectionBy Issue DateAuthorsTitlesSubjectsPublication TypeSponsorSupervisors
LoginRegister

Library & University Collections HomeUniversity of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact
Privacy & Cookies
Takedown Policy
Accessibility
Contact
feed RSS Feeds

RSS Feed not available for this page