1. Bounding Fixed Points of Set-Based Bellman Operator and Nash Equilibria of Stochastic Games
- Author
-
Li, Sarah H. Q., Assalé, Adjé, Garoche, Pierre-Loïc, Açıkmeşe, Behçet, University of Washington [Seattle], Université de Perpignan Via Domitia (UPVD), Ecole Nationale de l'Aviation Civile (ENAC), This work has been partially supported by Feanicses project, France ANR-17-CE25-0018 and NSF, USA CNS-1736582. The material in this paper was presented at the 21st IFAC World Congress (IFAC 2020), July 12–17, 2020, Berlin, Germany., and ANR-17-CE25-0018,FEANICSES,Analyses formelles et exhaustives de systèmes embarqués de contrôle à base de calculs intensifs(2017)
- Subjects
FOS: Computer and information sciences ,0209 industrial biotechnology ,Mathematical optimization ,Computer Science::Computer Science and Game Theory ,Computer science ,MathematicsofComputing_NUMERICALANALYSIS ,02 engineering and technology ,Interval (mathematics) ,Fixed point ,learning in games ,decision making and autonomy ,symbols.namesake ,020901 industrial engineering & automation ,Operator (computer programming) ,[INFO.INFO-FL]Computer Science [cs]/Formal Languages and Automata Theory [cs.FL] ,Computer Science - Computer Science and Game Theory ,Bellman equation ,0202 electrical engineering, electronic engineering, information engineering ,[INFO.INFO-SY]Computer Science [cs]/Systems and Control [cs.SY] ,FOS: Mathematics ,stochastic control ,multi-agent systems ,Electrical and Electronic Engineering ,Mathematics - Optimization and Control ,Stochastic control ,[INFO.INFO-GT]Computer Science [cs]/Computer Science and Game Theory [cs.GT] ,020208 electrical & electronic engineering ,Stochastic game ,Control and Systems Engineering ,Nash equilibrium ,learning theory ,Optimization and Control (math.OC) ,symbols ,Markov decision process ,Computer Science and Game Theory (cs.GT) - Abstract
Motivated by uncertain parameters encountered in Markov decision processes (MDPs) and stochastic games, we study the effect of parameter uncertainty on Bellman operator-based algorithms under a set-based framework. Specifically, we first consider a family of MDPs where the cost parameters are in a given compact set; we then define a Bellman operator acting on a set of value functions to produce a new set of value functions as the output under all possible variations in the cost parameter. We prove the existence of a fixed point of this set-based Bellman operator by showing that it is contractive on a complete metric space, and explore its relationship with the corresponding family of MDPs and stochastic games. Additionally, we show that given interval set bounded cost parameters, we can form exact bounds on the set of optimal value functions. Finally, we utilize our results to bound the value function trajectory of a player in a stochastic game., 15 pages, 4 figures
- Published
- 2020
- Full Text
- View/download PDF