Model-Free Reinforcement Learning for Stochastic Parity Games
- Authors
Hahn, Ernst Moritz; Perez, Mateo; Schewe, Sven; Somenzi, Fabio; Trivedi, Ashutosh; Wojtczak, Dominik
- Subjects
Computer Science::Computer Science and Game Theory, Theory of computation → Automata over infinite objects, Computing methodologies → Machine learning algorithms, Theory of computation → Convergence and learning in games, Reinforcement learning, ComputingMilieux_PERSONALCOMPUTING, Stochastic games, Mathematics of computing → Markov processes, Omega-regular objectives
- Abstract
This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena (a stochastic game graph with unknown but fixed transition probabilities) to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games parameterized by ε, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as ε tends to 0. Since this reduction does not require knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies of both players in the underlying stochastic parity game. We also present a streamlined reduction from 1½-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on experimental evaluations of both reductions.
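- Illustrative Sketch
The abstract's pipeline ends with running a model-free learner, such as minimax Q-learning, on the reachability game produced by the reduction. The following minimal Python sketch shows what that learning step can look like on a turn-based stochastic reachability game. The toy arena, its transition probabilities, and all hyperparameters below are invented for illustration; they are not taken from the paper, and the paper's actual algorithms and experiments may differ.

```python
import random
from collections import defaultdict

# Hypothetical turn-based arena (not from the paper):
# state -> owner, and (state, action) -> [(probability, successor), ...].
OWNER = {0: 'max', 1: 'min'}
ACTIONS = {0: ['a', 'b'], 1: ['a', 'b']}
TRANSITIONS = {
    (0, 'a'): [(0.5, 1), (0.5, 2)],
    (0, 'b'): [(1.0, 1)],
    (1, 'a'): [(0.2, 2), (0.8, 3)],
    (1, 'b'): [(0.5, 0), (0.5, 3)],
}
TARGET, SINK = {2}, {3}          # both absorbing
ABSORBING = TARGET | SINK

def sample_step(state, action):
    """Sample a successor; the learner never inspects the probabilities."""
    r, acc = random.random(), 0.0
    for p, succ in TRANSITIONS[(state, action)]:
        acc += p
        if r <= acc:
            return succ
    return succ  # guard against floating-point round-off

Q = defaultdict(float)           # Q[(state, action)], initialized to 0

def state_value(state):
    """Minimax backup: Max maximizes, Min minimizes the reach probability."""
    vals = [Q[(state, a)] for a in ACTIONS[state]]
    return max(vals) if OWNER[state] == 'max' else min(vals)

ALPHA, EPSILON, EPISODES, HORIZON = 0.1, 0.2, 50000, 50
for _ in range(EPISODES):
    s = 0
    for _ in range(HORIZON):
        if random.random() < EPSILON:            # both players explore
            a = random.choice(ACTIONS[s])
        else:                                    # owner of s acts greedily
            pick = max if OWNER[s] == 'max' else min
            a = pick(ACTIONS[s], key=lambda act: Q[(s, act)])
        s2 = sample_step(s, a)
        # Reward 1 on entering the target; the reachability probability is
        # the undiscounted total reward, since target and sink are absorbing.
        reward = 1.0 if s2 in TARGET else 0.0
        future = 0.0 if s2 in ABSORBING else state_value(s2)
        Q[(s, a)] += ALPHA * (reward + future - Q[(s, a)])
        if s2 in ABSORBING:
            break
        s = s2

# In this toy game the exact value of state 0 is 0.6.
print('learned value of state 0:', round(state_value(0), 3))
```

Because the arena is turn-based, the minimax backup reduces to a max at Max's states and a min at Min's states. The learner only samples successors through sample_step and never reads the transition probabilities, which is what makes the approach model-free.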
- Published
- 2020