376 results for "Zero-Sum games"
Search Results
2. Path-Dependent Zero-Sum Deterministic Games with Intermediate Hamiltonians.
- Author
- Hernández-Hernández, Daniel and Kaise, Hidehiro
- Subjects
- ZERO sum games, PARTIAL differential equations, VISCOSITY solutions, DIFFERENTIAL games, EQUATIONS
- Abstract
Motivated by D. Hernández-Hernández and M. Sîrbu (2018), we consider path-dependent Isaacs partial differential equations (PDEs) of first order with intermediate Hamiltonians given by convex combinations of lower and upper Hamiltonians. We propose discrete-time approximations which converge to the unique viscosity solution of the intermediate Isaacs PDEs. Furthermore, we give discrete-time stochastic dynamic game representations for the approximations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
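Editor's note: as a toy illustration of the discrete-time dynamic game approximations described in the abstract above, the sketch below computes the lower value of a one-dimensional deterministic game by backward recursion on a grid. The dynamics, action sets, step sizes, and terminal cost are all invented for illustration; the paper's path-dependent, intermediate-Hamiltonian setting is far more general.

```python
import numpy as np

# Illustrative 1-D zero-sum game: dx/dt = u + w, |u| <= 1 (maximizer),
# |w| <= 0.5 (minimizer), terminal cost g(x) = x**2. We approximate the
# lower-value function with the discrete-time scheme
#   V(t, x) = max_u min_w V(t + dt, x + (u + w) dt),
# a crude stand-in for the semigroup approximations studied in the paper.
T, dt = 1.0, 0.05
xs = np.linspace(-3.0, 3.0, 241)
U = np.linspace(-1.0, 1.0, 5)        # maximizer's action grid
W = np.linspace(-0.5, 0.5, 5)        # minimizer's action grid

V = xs ** 2                          # terminal condition V(T, x) = g(x)
for _ in range(int(T / dt)):
    V_next = V.copy()
    # For each (u, w) pair, evaluate V(t+dt, x + (u+w) dt) by interpolation.
    candidates = np.array([
        [np.interp(xs + (u + w) * dt, xs, V_next) for w in W] for u in U
    ])                               # shape (|U|, |W|, len(xs))
    V = candidates.min(axis=1).max(axis=0)   # max over u of min over w

print("approximate lower value at x = 0:", V[xs.size // 2])
```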
3. An empirical Bayes approach for estimating skill models for professional darts players.
- Author
- Haugh, Martin B. and Wang, Chun
- Subjects
- ZERO sum games, RULES of games, DATA analysis, PROFESSIONAL employees
- Abstract
We perform an exploratory data analysis on a dataset for the top 16 professional darts players from the 2019 season. We use this dataset to fit player skill models which can then be used in dynamic zero-sum games (ZSGs) that model real-world matches between players. We propose an empirical Bayesian approach based on the Dirichlet-Multinomial (DM) model that overcomes limitations in the data. Specifically, we introduce two DM-based skill models where the first model borrows strength from other darts players and the second model borrows strength from other regions of the dartboard. We find these DM-based models outperform simpler benchmark models with respect to Brier and Spherical scores, both of which are proper scoring rules. We also show in ZSG settings that the difference between DM-based skill models and the simpler benchmark models is practically significant. Finally, we use our DM-based model to analyze specific situations that arose in real-world darts matches during the 2019 season. [ABSTRACT FROM AUTHOR]
- Published
- 2024
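Editor's note: a minimal sketch of the Dirichlet-Multinomial shrinkage idea described above, with made-up hit counts and a fixed prior strength rather than the authors' actual empirical Bayes estimation procedure.

```python
import numpy as np

# Toy dart-region hit counts for three players over 4 board regions.
counts = np.array([
    [120,  40,  25,  15],
    [ 10,   5,   3,   2],    # sparse player: raw MLE would be noisy
    [ 80,  60,  30,  30],
])

# Empirical-Bayes-style Dirichlet prior: centre the prior on the pooled
# frequencies across players ("borrowing strength"), with prior strength s.
pooled = counts.sum(axis=0) / counts.sum()
s = 50.0                              # assumed prior strength, not fitted here
alpha = s * pooled                    # Dirichlet prior parameters

# Posterior mean under the Dirichlet-Multinomial model:
#   E[p_j | data] = (n_j + alpha_j) / (n + sum(alpha))
post_mean = (counts + alpha) / (counts.sum(axis=1, keepdims=True) + s)

print("raw MLE, sparse player:   ", counts[1] / counts[1].sum())
print("shrunk estimate, player 1:", post_mean[1].round(3))
```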
4. The Price of History-Independent Strategies in Games with Inter-Temporal Externalities.
- Author
- Tsodikovich, Yevgeny, Venel, Xavier, and Zseleva, Anna
- Abstract
In this paper, we compare the value of zero-sum stochastic games under optimal strategies (which are, for single-controller stochastic games, stationary) to the value under the commonly used time-independent strategies ("static strategies"). Our findings are summarized in a series of theorems that provide lower bounds on the performance of the static strategy under different assumptions. These bounds can be used to assess whether the additional computational complexity is worth the extra payoff gain or, symmetrically, to assess the price of playing sub-optimal but simple strategies when stationary ones are forbidden. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. Strategy investments in zero-sum games.
- Author
- Garcia, Raul, Hosseinian, Seyedmohammadhossein, Pai, Mallesh, and Schaefer, Andrew J.
- Abstract
We propose an extension of two-player zero-sum games, where one player may select available actions for themselves and the opponent, subject to a budget constraint. We present a mixed-integer linear programming (MILP) formulation for the problem, provide analytical results regarding its solution, and discuss applications in the security and advertising domains. Our computational experiments demonstrate that heuristic approaches, on average, yield suboptimal solutions with a relative gap of at least 20% compared with those obtained by the MILP formulation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
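Editor's note: for tiny instances, the budgeted action-selection problem above can be brute-forced instead of solved as a MILP: enumerate affordable row subsets and solve each restricted zero-sum game exactly by the standard game-value LP (here via scipy.optimize.linprog). The payoff matrix, costs, and budget are invented.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def game_value(A):
    """Value of the zero-sum game with payoff matrix A (row player maximizes),
    via the standard LP: max v s.t. A^T p >= v, sum(p) = 1, p >= 0."""
    m, n = A.shape
    c = np.zeros(m + 1); c[-1] = -1.0                  # minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])          # v - (A^T p)_j <= 0
    A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)   # probabilities sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return -res.fun

# Invented instance: the investing (row) player may buy rows under a budget.
A = np.array([[3., -1., 0.], [1., 2., -2.], [-1., 1., 4.], [2., 0., 1.]])
cost = np.array([2., 1., 3., 2.])
budget = 4.0

best = max(
    (game_value(A[list(rows), :]), rows)
    for r in range(1, len(cost) + 1)
    for rows in itertools.combinations(range(len(cost)), r)
    if cost[list(rows)].sum() <= budget
)
print("best value %.3f with rows %s" % best)
```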
6. The Reactive Synthesis Competition (SYNTCOMP): 2018–2021.
- Author
- Jacobs, Swen, Pérez, Guillermo A., Abraham, Remco, Bruyère, Véronique, Cadilhac, Michaël, Colange, Maximilien, Delfosse, Charly, van Dijk, Tom, Duret-Lutz, Alexandre, Faymonville, Peter, Finkbeiner, Bernd, Khalimov, Ayrat, Klein, Felix, Luttenberger, Michael, Meyer, Klara, Michaud, Thibaud, Pommellet, Adrien, Renkin, Florian, Schlehuber-Caissier, Philipp, and Sakr, Mouhammad
- Subjects
- ZERO sum games, LOGIC, MEMORY, LIBRARIES
- Abstract
We report on the last four editions of the reactive synthesis competition (SYNTCOMP 2018–2021). We briefly describe the evaluation scheme and the experimental setup of SYNTCOMP. Then we introduce new benchmark classes that have been added to the SYNTCOMP library and give an overview of the participants of SYNTCOMP. Finally, we present and analyze the results of our experimental evaluations, including a ranking of tools with respect to quantity and quality—that is, the total size in terms of logic and memory elements—of solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Unbeatable strategies.
- Author
- Amir, Rabah, Evstigneev, Igor V., and Potapova, Valeriya
- Subjects
- ZERO sum games, GAME theory
- Abstract
The paper analyzes the notion of an unbeatable strategy as a game-theoretic solution concept. A general framework (games with relative preferences) suitable for the analysis of this concept is proposed. Basic properties of unbeatable strategies are presented, and a number of examples and applications are considered. [ABSTRACT FROM AUTHOR]
- Published
- 2024
8. Eigenvalue Methods for Sparse Tropical Polynomial Systems
- Author
- Akian, Marianne, Béreau, Antoine, and Gaubert, Stéphane
- Published
- 2024
9. Crisis management in English local government: the limits of resilience
- Author
- Arrieta, Tania and Davies, Jonathan S.
- Published
- 2024
10. A survey of decision making in adversarial games.
- Author
- Li, Xiuxian, Meng, Min, Hong, Yiguang, and Chen, Jie
- Abstract
In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and national defense, players often have adversarial stances, i.e., the selfish actions of each player inevitably or intentionally inflict loss or wreak havoc on other players. Therefore, adversarial games are important in real-world applications. However, only special adversarial games, such as Bayesian games, are reviewed in the literature. In this respect, this study aims to provide a systematic survey of three main game models widely employed in adversarial games, i.e., zero-sum normal-form and extensive-form games, Stackelberg (security) games, and zero-sum differential games, from an array of perspectives, including basic knowledge of game models, (approximate) equilibrium concepts, problem classifications, research frontiers, (approximate) optimal strategy-seeking techniques, prevailing algorithms, and practical applications. Finally, promising future research directions are also discussed for relevant adversarial games. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Adaptive critic control with multi‐step policy evaluation for nonlinear zero‐sum games.
- Author
- Li, Xin, Wang, Ding, Wang, Jiangyu, and Qiao, Junfei
- Subjects
- ZERO sum games, ADAPTIVE control systems, COST functions, LINEAR systems, DISCRETE-time systems
- Abstract
To attenuate the effect of disturbances on control performance, a multi-step adaptive critic control (MsACC) framework is developed to solve zero-sum games for discrete-time nonlinear systems. The MsACC algorithm utilizes multi-step policy evaluation to obtain the solution of the Hamilton–Jacobi–Isaacs equation, which converges faster than one-step policy evaluation. The convergence rate of the MsACC algorithm is adjustable by varying the step size of the policy evaluation. In addition, the stability and convergence of the MsACC algorithm are proved under certain conditions. To realize the MsACC algorithm, three neural networks are established to approximate the control input, the disturbance input, and the cost function, respectively. Finally, the effectiveness of the MsACC algorithm is verified by two simulation examples, including a linear system and a nonlinear plant. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. Non-Markovian Impulse Control Under Nonlinear Expectation.
- Author
- Perninge, Magnus
- Subjects
- ZERO sum games, STOCHASTIC differential equations, DIFFERENTIAL games, DYNAMIC programming, PROBABILITY measures
- Abstract
We consider a general type of non-Markovian impulse control problem under adverse non-linear expectation or, more specifically, the zero-sum game problem where the adversary player decides the probability measure. We show that the upper and lower value functions satisfy a dynamic programming principle (DPP). We first prove the DPP for a truncated version of the upper value function in a straightforward manner. Relying on a uniform convergence argument then enables us to show the DPP for the general setting. Following this, we use an approximation based on a combination of truncation and discretization to show that the upper and lower value functions coincide, thus establishing that the game has a value and that the DPP holds for the lower value function as well. Finally, we show that the DPP admits a unique solution and give conditions under which a saddle point for the game exists. As an example, we consider a stochastic differential game (SDG) of impulse versus classical control of path-dependent stochastic differential equations (SDEs). [ABSTRACT FROM AUTHOR]
- Published
- 2023
13. Bargaining and Game Theory
- Author
- Gaffal, Margit and Padilla Gálvez, Jesús
- Published
- 2023
14. Strategy Determination for Multiple USVs: A Min-max Q-learning Approach
- Author
- Hong, Le and Cui, Weicheng
- Published
- 2023
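Editor's note: a generic tabular minimax-Q sketch in the spirit of the approach named in entry 14 above, on an invented random two-player zero-sum Markov game. For brevity the stage-game value uses the pure-strategy maxmin; full Littman-style minimax-Q solves a small LP for the mixed-strategy value instead.

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nA, nO = 4, 3, 3                    # states, our actions, opponent actions
# Invented stochastic game: random rewards and transition kernels.
R = rng.normal(size=(nS, nA, nO))
P = rng.dirichlet(np.ones(nS), size=(nS, nA, nO))   # P[s, a, o] -> dist over s'

Q = np.zeros((nS, nA, nO))
gamma, alpha = 0.9, 0.1

def maxmin_value(Qs):
    # Simplification: pure-strategy maxmin of the stage game at one state.
    return Qs.min(axis=1).max()

s = 0
for step in range(20000):
    a, o = rng.integers(nA), rng.integers(nO)       # exploratory play
    s2 = rng.choice(nS, p=P[s, a, o])
    target = R[s, a, o] + gamma * maxmin_value(Q[s2])
    Q[s, a, o] += alpha * (target - Q[s, a, o])     # standard Q-learning step
    s = s2

print("learned maxmin values per state:",
      np.round([maxmin_value(Q[s]) for s in range(nS)], 3))
```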
15. The Regularity of the Value Function of Repeated Games with Switching Costs.
- Author
- Tsodikovich, Yevgeny, Venel, Xavier, and Zseleva, Anna
- Subjects
- SWITCHING costs, ZERO sum games, VALUATION of real property, GAMES
- Abstract
We study repeated zero-sum games where one of the players pays a certain cost each time he changes his action. We derive the properties of the value and optimal strategies as a function of the ratio between the switching costs and the stage payoffs. In particular, the strategies exhibit a robustness property and typically do not change with a small perturbation of this ratio. Our analysis extends partially to the case where the players are limited to simpler strategies that are history independent, namely, static strategies. In this case, we also characterize the (minimax) value and the strategies for obtaining it. Funding: The project leading to this publication has received funding from the French government under the "France 2030" investment plan managed by the French National Research Agency [Grant ANR-17-EURE-0020] and from the Excellence Initiative of Aix-Marseille University–A*MIDEX. Y. Tsodikovich was supported in part by the Israel Science Foundation [Grants 2566/20, 1626/18, and 448/22]. X. Venel acknowledges the financial support of the French National Research Agency through Project CIGNE [ANR-15-CE38-0007-01]. [ABSTRACT FROM AUTHOR]
- Published
- 2023
16. Dichotomy value iteration with parallel learning design towards discrete-time zero-sum games.
- Author
- Wang, Jiangyu, Wang, Ding, Li, Xin, and Qiao, Junfei
- Subjects
- ZERO sum games, REINFORCEMENT learning, COST functions, DISCRETE-time systems, NONLINEAR systems
- Abstract
In this paper, a novel parallel learning framework is developed to solve zero-sum games for discrete-time nonlinear systems. Briefly, the purpose of this study is to determine a tentative function according to prior knowledge of the value iteration (VI) algorithm. The learning process of the parallel controllers can be guided by the tentative function. That is, the neighborhood of the optimal cost function can be compressed to a small range via two typical exploration policies. Based on the parallel learning framework, a novel dichotomy VI algorithm is established to accelerate the learning speed. It is shown that the parallel controllers converge to the optimal policy from contrary initial policies. Finally, two typical systems are used to demonstrate the learning performance of the constructed dichotomy VI algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
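Editor's note: the toy below mimics the two-sided idea in entry 16 above on a finite zero-sum Markov game. Because the maxmin Bellman operator is monotone, value iteration started below and above the fixed point converges from both sides, so the gap between the two iterates brackets the optimal cost. The game, sizes, and the pure-strategy stage solve are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
nS, nA, nO, gamma = 5, 3, 3, 0.9
R = rng.uniform(0.0, 1.0, size=(nS, nA, nO))        # rewards in [0, 1]
P = rng.dirichlet(np.ones(nS), size=(nS, nA, nO))   # transition kernels

def bellman(V):
    # T V(s) = max_a min_o [ R(s,a,o) + gamma * E V(s') ]  (pure maxmin here)
    Q = R + gamma * (P @ V)          # shape (nS, nA, nO)
    return Q.min(axis=2).max(axis=1)

V_lo = np.zeros(nS)                         # below the fixed point (R >= 0)
V_hi = np.full(nS, 1.0 / (1.0 - gamma))     # above the fixed point (R <= 1)
for k in range(60):
    V_lo, V_hi = bellman(V_lo), bellman(V_hi)

# Monotone convergence from both sides pins the value inside [V_lo, V_hi].
print("bracket width after 60 sweeps:", np.max(V_hi - V_lo))
```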
17. Finite-horizon Q-learning for discrete-time zero-sum games with application to H∞ control.
- Author
- Liu, Mingxiang, Cai, Qianqian, Meng, Wei, Li, Dandan, and Fu, Minyue
- Subjects
- ZERO sum games, RICCATI equation, REINFORCEMENT learning, SYSTEM dynamics, MODEL airplanes
- Abstract
In this paper, we investigate optimal control strategies for model-free zero-sum games involving H∞ control. The key contribution is the development of a Q-learning algorithm for linear quadratic games without knowing the system dynamics. The finite-horizon setting is more practical than the infinite-horizon setting, but it is difficult to solve the time-varying Riccati equation associated with the finite-horizon setting directly. The proposed algorithm is shown to solve the time-varying Riccati equation iteratively without the use of models, and numerical experiments on aircraft dynamics demonstrate the algorithm's efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2023
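Editor's note: when the model is known, the time-varying Riccati recursion behind the finite-horizon zero-sum LQ game in entry 17 is short. The sketch below iterates the textbook-style recursion for invented system matrices and cost sum x'Qx + u'Ru - gamma^2 w'w, checking the saddle condition at each step; this is my reading of the standard setup, not the authors' exact equations (their contribution is solving it model-free).

```python
import numpy as np

# Invented discrete-time system x+ = A x + B u + E w and horizon N.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
E = np.array([[0.05], [0.0]])
Q = np.eye(2); R = np.eye(1); Qf = np.eye(2)
gamma, N = 2.0, 50

G = np.hstack([B, E])                       # stacked input matrices
S = np.block([[R, np.zeros((1, 1))],
              [np.zeros((1, 1)), -gamma**2 * np.eye(1)]])

P = Qf                                      # backward Riccati recursion
for k in range(N):
    M = S + G.T @ P @ G
    # Saddle-point existence check: gamma^2 I - E'PE must stay positive.
    assert np.all(np.linalg.eigvalsh(gamma**2 * np.eye(1) - E.T @ P @ E) > 0)
    P = Q + A.T @ P @ A - A.T @ P @ G @ np.linalg.solve(M, G.T @ P @ A)

# Saddle-point gains at the start of the horizon: [u; w] = -M^{-1} G'PA x.
K = np.linalg.solve(S + G.T @ P @ G, G.T @ P @ A)
print("P_0 =\n", np.round(P, 4), "\ngains =\n", np.round(-K, 4))
```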
18. A Survey on Decomposition of Finite Strategic-Form Games
- Author
- Hao, Yaqi and Zhang, Ji-Feng
- Published
- 2022
19. Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics.
- Author
- Lin, Mingduo, Zhao, Bo, and Liu, Derong
- Subjects
- ZERO sum games, DYNAMIC programming, NONLINEAR programming, ITERATIVE learning control, WEIGHT training, ADAPTIVE fuzzy control, REINFORCEMENT learning, ELECTRONIC data processing
- Abstract
A novel policy gradient (PG) adaptive dynamic programming method is developed to deal with nonlinear discrete-time zero-sum games with unknown dynamics. To facilitate the implementation, a policy iteration algorithm is established to approximate the iterative Q-function, as well as the control and disturbance policies, via three neural network (NN) approximators, respectively. Then, the iterative Q-function is exploited to update the control and disturbance policies via the PG method. To stabilize the training process and improve data usage efficiency, the experience replay technique is applied to train the weight vectors of the three NNs by using mini-batch empirical data from replay memory. Furthermore, the convergence of the iterative Q-function is proved. Simulation results of two numerical examples are provided to show the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
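Editor's note: a minimal replay buffer of the kind used in entry 19 above, storing transitions and sampling mini-batches uniformly. This is a generic utility sketch, not the authors' code; names and capacities are assumptions.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (x, u, w, reward, x_next) transitions."""
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)   # oldest transitions fall off

    def push(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size):
        # Uniform sampling breaks temporal correlation in the training data,
        # which is the stabilizing effect the abstract describes.
        return random.sample(self.buf, min(batch_size, len(self.buf)))

buffer = ReplayBuffer()
for k in range(100):
    buffer.push((k, 0.0, 0.0, -1.0, k + 1))   # dummy transitions
minibatch = buffer.sample(32)
print(len(minibatch), "transitions sampled")
```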
20. Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to H∞ control.
- Author
- Liu, Mingxiang, Cai, Qianqian, Li, Dandan, Meng, Wei, and Fu, Minyue
- Subjects
- ZERO sum games, REINFORCEMENT learning, LINEAR control systems, STATE feedback (Feedback control systems), RICCATI equation, VECTOR valued functions, HORIZON
- Abstract
In this paper, we present a Q-learning framework for solving finite-horizon zero-sum game problems involving the H∞ control of a linear system with unknown dynamics. Past research mainly focused on solving problems in infinite horizon with a completely measurable state. However, in practical engineering the system state is not always directly accessible, and it is difficult to solve the time-varying Riccati equation associated with the finite-horizon setting directly. The main contribution of the proposed model-free algorithm is to determine the optimal output feedback policies without state measurements in the finite-horizon setting. To achieve this goal, we first describe the Q-function arising in finite-horizon problems in the context of state feedback, and then parameterize the Q-functions as functions of input–output vectors. Finally, numerical examples on aircraft dynamics demonstrate the algorithm's efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2023
21. On max-min solutions of fuzzy games with nonlinear membership functions.
- Author
- Çevikel, Adem Cengiz
- Subjects
- MEMBERSHIP functions (Fuzzy logic), NONLINEAR functions, ZERO sum games, GAMES, FUZZY sets
- Abstract
In this paper, we deal with two-person zero-sum games with fuzzy goals. We investigate the case where the membership functions of the players are nonlinear, and examine what the solutions should be when the membership functions are exponential. For exponential membership functions, we develop a new method for the maximin solution according to the degree of attainment of the fuzzy goals. An application is given to show the effectiveness of the method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
22. Fast and Simple Solutions of Blotto Games.
- Author
- Behnezhad, Soheil, Dehghani, Sina, Derakhshan, Mahsa, Hajiaghayi, MohammadTaghi, and Seddighin, Saeed
- Abstract
The Colonel Blotto game (initially introduced by Borel in 1921) is commonly used for analyzing a wide range of applications from the U.S. presidential election to innovative technology competitions to advertising, sports, and politics. After around a century, Ahmadinejad et al. provided the first polynomial-time algorithm for computing the Nash equilibria in Colonel Blotto games. However, their algorithm consists of an exponential-size LP solved by the ellipsoid method, which is highly impractical. In "Fast and Simple Solutions of Blotto Games," Behnezhad, Dehghani, Derakhshan, Hajiaghayi, and Seddighin provide the first polynomial-size LP formulation of the optimal strategies for the Colonel Blotto game using linear extension techniques. They use this polynomial-size LP to provide a simpler and significantly faster algorithm for finding optimal strategies of the Colonel Blotto game. They further show this representation is asymptotically tight, which means there exists no other linear representation of the strategy space with fewer constraints. In the Colonel Blotto game, which was initially introduced by Borel in 1921, two colonels simultaneously distribute their troops across different battlefields. The winner of each battlefield is determined independently by a winner-takes-all rule. The ultimate payoff for each colonel is the number of battlefields won. The Colonel Blotto game is commonly used for analyzing a wide range of applications from the U.S. presidential election to innovative technology competitions to advertising, sports, and politics. There are persistent efforts to find the optimal strategies for the Colonel Blotto game. However, the first polynomial-time algorithm for that has very recently been provided by Ahmadinejad, Dehghani, Hajiaghayi, Lucier, Mahini, and Seddighin. Their algorithm consists of an exponential-size linear program (LP), which they solve using the ellipsoid method. Because of the use of the ellipsoid method, despite its significant theoretical importance, this algorithm is highly impractical. In general, even the simplex method (despite its exponential worst-case running time) performs better than the ellipsoid method in practice. In this paper, we provide the first polynomial-size LP formulation of the optimal strategies for the Colonel Blotto game using linear extension techniques. Roughly speaking, we consider the natural representation of the strategy space polytope and transform it to a higher dimensional strategy space, which interestingly has exponentially fewer facets. In other words, we add a few variables to the LP such that, surprisingly, the number of constraints drops down to a polynomial. We use this polynomial-size LP to provide a simpler and significantly faster algorithm for finding optimal strategies of the Colonel Blotto game. We further show this representation is asymptotically tight, which means there exists no other linear representation of the strategy space with fewer constraints. We also extend our approach to multidimensional Colonel Blotto games, in which players may have different sorts of budgets, such as money, time, human resources, etc. By implementing this algorithm, we are able to run tests that were previously impossible to solve in a reasonable time. This information allows us to observe some interesting properties of Colonel Blotto; for example, we find out the behavior of players in the discrete model is very similar to that in the continuous model Roberson solved.
Funding: This work was supported in part by NSF CAREER award CCF-1053605, NSF BIGDATA [Grant IIS-1546108], NSF AF:Medium [Grant CCF-1161365], DARPA GRAPHS/AFOSR [Grant FA9550-12-1-0423], and another DARPA SIMPLEX grant. [ABSTRACT FROM AUTHOR]
- Published
- 2023
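Editor's note: to make entry 22 concrete, a very small Blotto instance can be solved without any clever polynomial-size formulation, by enumerating all allocations and solving the resulting matrix game directly. The sketch below does this for 5 troops on 3 battlefields with symmetric budgets, payoff equal to battlefields won minus battlefields lost (ties count for neither); this brute force is only feasible because the strategy space is tiny, which is precisely the paper's motivation for a compact LP.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def allocations(troops, fields):
    """All ways to split `troops` across `fields` ordered battlefields."""
    if fields == 1:
        yield (troops,)
        return
    for first in range(troops + 1):
        for rest in allocations(troops - first, fields - 1):
            yield (first,) + rest

acts = list(allocations(5, 3))                      # 21 pure strategies
A = np.array([[sum(np.sign(a - b) for a, b in zip(x, y)) for y in acts]
              for x in acts], dtype=float)          # won minus lost

# Row player's optimal mixed strategy via the standard game LP.
m = len(acts)
c = np.zeros(m + 1); c[-1] = -1.0                   # minimize -v
res = linprog(c, A_ub=np.hstack([-A.T, np.ones((m, 1))]), b_ub=np.zeros(m),
              A_eq=np.append(np.ones(m), 0.0).reshape(1, -1), b_eq=[1.0],
              bounds=[(0, None)] * m + [(None, None)])
p = res.x[:-1]
print("value:", round(-res.fun, 6))                 # 0 by symmetry
for a, w in zip(acts, p):
    if w > 1e-6:
        print(a, round(w, 3))                       # support of optimal mix
```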
23. Synergetic learning structure-based neuro-optimal fault tolerant control for unknown nonlinear systems.
- Author
- Xia, Hongbing, Zhao, Bo, and Guo, Ping
- Subjects
- ADAPTIVE control systems, FAULT-tolerant computing, NONLINEAR systems, ADAPTIVE fuzzy control, ZERO sum games, RADIAL basis functions, SYSTEM failures, DIFFERENTIAL games
- Abstract
In this paper, a synergetic learning structure-based neuro-optimal fault tolerant control (SLSNOFTC) method is proposed for unknown nonlinear continuous-time systems with actuator failures. Under the framework of the synergetic learning structure (SLS), the optimal control input and the actuator failure are viewed as two subsystems. Then, the fault tolerant control (FTC) problem can be regarded as a two-player zero-sum differential game according to game theory. A radial basis function neural network-based identifier, which uses the measured input/output data, is constructed to identify the completely unknown system dynamics. To develop the SLSNOFTC method, the Hamilton–Jacobi–Isaacs equation is solved by an asymptotically stable critic neural network (ASCNN) composed of cooperative adaptive tuning laws. Moreover, with the help of Lyapunov stability analysis, the identification error, the ASCNN weight error, and all signals of the closed-loop system are guaranteed to converge to zero asymptotically, rather than being merely uniformly ultimately bounded. Numerical simulation examples further verify the effectiveness and reliability of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
24. Continuous Patrolling Games.
- Author
- Alpern, Steve, Bui, Thuy, Lidbetter, Thomas, and Papadaki, Katerina
- Subjects
- ZERO sum games, METRIC spaces, ARC length, GAMES, BORDER crossing
- Abstract
Sometimes it is necessary to have a Patroller (on foot, in a car, or maybe a drone) move around a network so as to prevent an intruder (the Attacker) from infiltrating or otherwise ruining the network operation. The "attack" could be, for example, removing a painting from the Louvre, crossing a border, or planting a bomb. The first possibility could take place at only a discrete set of points on the network, say, nodes. However, the last two types of attack could take place anywhere. The latter continuous problem has been modelled as a game by Steve Alpern, Thuy Bui, Thomas Lidbetter, and Katerina Papadaki in the article "Continuous Patrolling Games." The Attacker decides when and where to attack (the duration of the attack is specified by the problem), whereas the Patroller chooses a unit speed path, possibly periodic. If the Patroller passes the attacked point while the attack is going on, she wins the game, as the attack is thwarted. Otherwise, the attack is successful, and the Attacker wins the game. The authors determine optimal strategies for both players for many classes of networks and find good strategies that work on any network. These ideas could be adapted to real-life patrolling problems on networks. We study a patrolling game played on a network Q, considered as a metric space. The Attacker chooses a point of Q (not necessarily a node) to attack during a chosen time interval of fixed duration. The Patroller chooses a unit speed path on Q and intercepts the attack (and wins) if she visits the attacked point during the attack-time interval. This zero-sum game models the problem of protecting roads or pipelines from an adversarial attack. The payoff to the maximizing Patroller is the probability that the attack is intercepted. Our results include the following: (i) a solution to the game for any network Q, as long as the time required to carry out the attack is sufficiently short; (ii) a solution to the game for all tree networks that satisfy a certain condition on their extremities; and (iii) a solution to the game for any attack duration for stars with one long arc and the remaining arcs equal in length. We present a conjecture on the solution of the game for arbitrary trees and establish it in certain cases. Funding: Financial support from the National Science Foundation [Grant CMMI-1935826] is gratefully acknowledged. Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.2346. [ABSTRACT FROM AUTHOR]
- Published
- 2022
25. Model-free adaptive optimal control policy for Markov jump systems: A value iterations algorithm.
- Author
- Zhou, Peixin, Wen, Jiwei, Swain, Akshya Kumar, and Luan, Xiaoli
- Abstract
This article develops a model-free adaptive optimal control policy for discrete-time Markov jump systems. First, a two-player zero-sum game is formulated to obtain an optimal control policy that minimizes a cost function against the worst-case disturbance. Second, an action- and mode-dependent value function is set up for the zero-sum game to search for such a policy with a convergence guarantee, rather than solving an optimization problem satisfying coupled algebraic Riccati equations. To be specific, motivated by the Bellman optimality principle, we develop an online value iteration algorithm to solve the zero-sum game, which learns while controlling without any initial stabilizing policy. With this algorithm, we can achieve disturbance attenuation for Markov jump systems without knowledge of the system matrices. Adaptivity to slowly changing uncertainties can also be achieved due to the model-free feature and policy convergence. Finally, the effectiveness and practical potential of the algorithm are demonstrated on two numerical examples and a solar boiler system. [ABSTRACT FROM AUTHOR]
- Published
- 2022
26. The Polymatrix Gap Conjecture.
- Author
- Naumov, Pavel and Simonelli, Italo
- Subjects
- ZERO sum games, NASH equilibrium, STRATEGY games, LOGICAL prediction, EQUILIBRIUM
- Abstract
This paper proposes a novel way to compare classes of strategic games based on their sets of pure Nash equilibria. This approach is then used to relate the classes of zero-sum, polymatrix, and k-polymatrix games. The paper concludes with a conjecture that k-polymatrix games form an increasing chain of classes. [ABSTRACT FROM AUTHOR]
- Published
- 2022
27. Fictitious Play in Markov Games with Single Controller.
- Author
- Sayin, Muhammed O., Zhang, Kaiqing, and Ozdaglar, Asuman
- Subjects
- MARKOV processes, NASH equilibrium, GAME theory, ECONOMICS, REINFORCEMENT learning
- Abstract
Certain important classes of strategic-form games, including zero-sum and identical-interest games, have the fictitious-play property (FPP), i.e., beliefs formed in fictitious play dynamics always converge to a Nash equilibrium (NE) in the repeated play of these games. Such convergence results are seen as a (behavioral) justification for game-theoretical equilibrium analysis. Markov games (MGs), also known as stochastic games, generalize the repeated play of strategic-form games to dynamic multi-state settings with Markovian state transitions. In particular, MGs are standard models for multi-agent reinforcement learning, a reviving research area in learning and games, and their game-theoretical equilibrium analyses have also been conducted extensively. However, whether certain classes of MGs have the FPP or not (i.e., whether there is a behavioral justification for equilibrium analysis or not) remains largely elusive. In this paper, we study a new variant of fictitious play dynamics for MGs and show its convergence to an NE in n-player identical-interest MGs in which a single player controls the state transitions. Such games are of interest in communications, control, and economics applications. Our result, together with the recent results in [42], establishes the FPP of two-player zero-sum MGs and n-player identical-interest MGs with a single controller (standing at two different ends of the MG spectrum from fully competitive to fully cooperative). [ABSTRACT FROM AUTHOR]
- Published
- 2022
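Editor's note: in the plain strategic-form case, the fictitious play dynamics studied in entry 27 take only a few lines. The sketch below runs classic discrete-time fictitious play on rock-paper-scissors, where the zero-sum fictitious-play property guarantees the empirical mixtures converge to the Nash equilibrium (1/3, 1/3, 1/3). The game and iteration count are illustrative.

```python
import numpy as np

# Rock-paper-scissors payoff to the row player.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

counts_row = np.ones(3)      # fictitious prior counts for each player
counts_col = np.ones(3)
for t in range(20000):
    # Each player best-responds to the opponent's empirical mixture.
    a = np.argmax(A @ (counts_col / counts_col.sum()))
    b = np.argmin((counts_row / counts_row.sum()) @ A)
    counts_row[a] += 1
    counts_col[b] += 1

print("row empirical mixture:", np.round(counts_row / counts_row.sum(), 3))
print("col empirical mixture:", np.round(counts_col / counts_col.sum(), 3))
```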
28. On the effect of clock offsets and quantization on learning-based adversarial games.
- Author
- Fotiadis, Filippos, Kanellopoulos, Aris, Vamvoudakis, Kyriakos G., and Hugues, Jerome
- Subjects
- LIPSCHITZ continuity, ZERO sum games, REINFORCEMENT learning, ITERATIVE learning control
- Abstract
In this work, we consider systems whose components suffer from clock offsets and quantization and study the effect of those on a reinforcement learning (RL) algorithm. Specifically, we consider an off-policy iterative RL algorithm for continuous-time systems, which uses input and state data to approximate the Nash equilibrium of a zero-sum game. However, the data used by this algorithm are not consistent with one another, in that each of them originates from a slightly different time instant of the past, hence putting the convergence of the algorithm in question. We prove that, provided these timing inconsistencies remain below a certain threshold, the iterative off-policy RL algorithm still converges epsilon-closely to the desired Nash policy. However, this result is conditional on a certain Lipschitz continuity and differentiability condition on the input-state data collected, which is indispensable in the presence of clock offsets. A similar result is also derived when quantization of the measured state is considered. Finally, unlike prior work, we provide a sufficiently rich data condition for the execution of the iterative RL algorithm, which can be verified a priori across all iteration indices. Simulations are performed, which verify and clarify the theoretical findings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
29. Penalty Method for Games of Constraints
- Author
- Konnov, Igor
- Published
- 2020
30. Using One-Sided Partially Observable Stochastic Games for Solving Zero-Sum Security Games with Sequential Attacks
- Author
- Tomášek, Petr, Bošanský, Branislav, and Nguyen, Thanh H.
- Published
- 2020
31. MASAGE: Model-Agnostic Sequential and Adaptive Game Estimation
- Author
- Pan, Yunian, Peng, Guanze, Chen, Juntao, and Zhu, Quanyan
- Published
- 2020
32. Adversarial Search and Game Theory
- Author
- Chowdhary, K. R.
- Published
- 2020
33. Robust Mean Field Linear Quadratic Social Control: Open-Loop and Closed-Loop Strategies.
- Author
- Liang, Yong, Wang, Bing-Chang, and Zhang, Huanshui
- Subjects
- QUADRATIC fields, ZERO sum games, SOCIAL control, MEAN field theory, RICCATI equation, STOCHASTIC differential equations
- Abstract
This paper investigates the robust social optimum problem for linear quadratic mean field control systems by the direct approach, where model uncertainty appears in both the drift and diffusion terms of each agent. We take a zero-sum game approach by considering the local disturbance as the control of an adversarial player. Under a centralized information structure, we first obtain the necessary and sufficient condition for the existence of open-loop and closed-loop saddle points, which are characterized by the solvability of forward-backward stochastic differential equations (FBSDEs) and two coupled Riccati equations, respectively. By considering the infinite system, we next design a set of decentralized open-loop strategies based on mean field FBSDEs and obtain closed-loop strategies in terms of two uncoupled Riccati equations. Finally, the performance of the proposed decentralized strategies is analyzed and their efficiency is verified by numerical simulation. [ABSTRACT FROM AUTHOR]
- Published
- 2022
34. Event-triggered adaptive integral reinforcement learning method for zero-sum differential games of nonlinear systems with incompletely known dynamics.
- Author
- Liu, Pengda, Zhang, Huaguang, Sun, Jiayue, and Tan, Zilong
- Subjects
- ZERO sum games, NONLINEAR systems, DIFFERENTIAL games, INTEGRALS, DYNAMIC programming
- Abstract
This paper designs a novel event-based adaptive learning method for solving zero-sum games (ZSGs) of nonlinear systems with incompletely known dynamics. First, a discounted cost is introduced for the system with a nonzero equilibrium point to obtain the near-optimal strategy pair. Then, the employment of integral reinforcement learning (IRL) makes it unnecessary to acquire a model of the drift dynamics. To approximate the solution of the Hamilton-Jacobi-Isaacs equations (HJIEs), a single-critic network is constructed with a modified tuning law utilizing preprocessed data. To increase algorithm efficiency, an event-triggered mechanism (ETM) is introduced, which obviates Zeno behavior. Furthermore, the state and critic weight vector errors are proved to be uniformly ultimately bounded (UUB) through a Lyapunov approach. Finally, the effectiveness of the proposed method is validated by a simulation experiment. [ABSTRACT FROM AUTHOR]
- Published
- 2022
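Editor's note: the core of an event-triggered mechanism like the one in entry 34 is that the control is recomputed only when the state has drifted far enough from the last sampled state. A stripped-down version of that logic on an invented scalar system follows; the static threshold rule here is my simplification, not the paper's adaptive triggering condition.

```python
# Scalar system x+ = a*x + b*u with a stabilizing gain K, invented numbers.
a, b, K = 1.02, 0.5, 1.2
x, x_sampled = 2.0, 2.0
u = -K * x_sampled
events = 0

for step in range(200):
    # Trigger: recompute control only when the gap |x - x_sampled| exceeds
    # a threshold; between events the last control is held constant. The
    # additive constant acts as a deadband (the discrete-time analogue of
    # excluding Zeno behavior).
    if abs(x - x_sampled) > 0.1 * abs(x) + 1e-3:
        x_sampled = x
        u = -K * x_sampled
        events += 1
    x = a * x + b * u

print("final state %.4f after %d control updates (of 200 steps)" % (x, events))
```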
35. Sıfır Toplamlı Oyun ve Markowitz'in Ortalama-Varyans Modeline Göre Portföy Seçimine İlişkin Karşılaştırmalı Bir Analiz: BIST 100 Örneği [A Comparative Analysis of Portfolio Selection According to the Zero-Sum Game and Markowitz's Mean-Variance Model: The Case of BIST 100].
- Author
- Kapucu, Hakan and Çalık, Hilal
- Published
- 2022
36. Model-Free Inverse H-Infinity Control for Imitation Learning
- Author
- Lian, Bosen, Xue, Wenqian, Kartal, Yusuf, Fan, Jialu, Chai, Tianyou, and Lewis, Frank L.
- Abstract
This paper proposes a data-driven model-free inverse reinforcement learning (IRL) algorithm tailored for solving an inverse H-infinity control problem. In the problem, both an expert and a learner engage in H-infinity control to reject disturbances, and the learner's objective is to imitate the expert's behavior by reconstructing the expert's performance function through IRL techniques. Introducing zero-sum game principles, we first formulate a model-based single-loop IRL policy iteration algorithm that includes three key steps: updating the policy, action, and performance function using a new correction formula and the standard inverse optimal control principles. Building upon the model-based approach, we propose a model-free single-loop off-policy IRL algorithm that eliminates the need for initial stabilizing policies and prior knowledge of the dynamics of expert and learner. Also, we provide rigorous proof of convergence, stability, and Nash optimality to guarantee the effectiveness and reliability of the proposed algorithms. Furthermore, we showcase the efficiency of our algorithm through simulations and experiments, highlighting its advantages compared to existing methods. This work was supported in part by NSFC under Grant 61991404, Grant 62394342, and Grant U22A2049; in part by the Liaoning Revitalization Talents Program under Grant XLYC2007135; in part by the 2020 Science and Technology Major Project of Liaoning Province under Grant 2020JH1/10100008; in part by the Key Research and Development Program of Liaoning Province under Grant 2023JH26/10200011; and in part by the Research Program of the Liaoning Liaohe Laboratory under Grant LLL23ZZ-05-01.
- Published
- 2024
37. Adaptive Learning Based Output-Feedback Optimal Control of CT Two-Player Zero-Sum Games.
- Author
- Zhao, Jun, Lv, Yongfeng, and Zhao, Ziliang
- Abstract
Although optimal control with full state feedback has been well studied, solving the output-feedback optimal control problem online is difficult, in particular learning the online Nash equilibrium solution of continuous-time (CT) two-player zero-sum differential games. For this purpose, we propose an adaptive learning algorithm to address this tricky problem. A modified game algebraic Riccati equation (MGARE) is derived by tailoring its state-feedback control counterpart. An adaptive online learning method is proposed to approximate the solution to the MGARE through online data, where two operations (i.e., vectorization and the Kronecker product) are adopted to reconstruct the MGARE. Only system output information is needed to implement the developed learning algorithm. Simulation results are provided to exemplify the proposed control and learning method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
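Editor's note: the vectorization and Kronecker-product operations named in entry 37 rest on the identity vec(AXB) = (B^T kron A) vec(X), which turns a linear matrix equation in an unknown P into ordinary least squares. Below is a small numpy check of the identity and of that style of use; all matrices are random stand-ins, not the paper's MGARE.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
X = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 2))

vec = lambda M: M.flatten(order="F")          # column-major vectorization
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
print("identity holds:", np.allclose(lhs, rhs))

# Use: recover an unknown symmetric P from samples of the quadratic form
#   y_i = x_i' P x_i = np.kron(x_i, x_i) . vec(P),
# the least-squares pattern behind Q-learning-style Riccati reconstruction.
P_true = np.array([[2.0, 0.3], [0.3, 1.0]])
xs = rng.normal(size=(50, 2))
Phi = np.stack([np.kron(x, x) for x in xs])   # regressors, one row per sample
ys = np.einsum("ni,ij,nj->n", xs, P_true, xs)
vecP, *_ = np.linalg.lstsq(Phi, ys, rcond=None)
print("recovered P:\n", np.round(vecP.reshape(2, 2), 4))
```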
38. Subgame Maxmin Strategies in Zero-Sum Stochastic Games with Tolerance Levels.
- Author
- Flesch, János, Herings, P. Jean-Jacques, Maes, Jasmine, and Predtetchinski, Arkadi
- Abstract
We study subgame ϕ-maxmin strategies in two-player zero-sum stochastic games with a countable state space, finite action spaces, and a bounded and universally measurable payoff function. Here, ϕ denotes the tolerance function that assigns a nonnegative tolerated error level to every subgame. Subgame ϕ-maxmin strategies are strategies of the maximizing player that guarantee the lower value in every subgame within the subgame-dependent tolerance level as given by ϕ. First, we provide necessary and sufficient conditions for a strategy to be a subgame ϕ-maxmin strategy. As a special case, we obtain a characterization for subgame maxmin strategies, i.e., strategies that exactly guarantee the lower value at every subgame. Second, we present sufficient conditions for the existence of a subgame ϕ-maxmin strategy. Finally, we show the possibly surprising result that each game admits a strictly positive tolerance function ϕ* with the following property: if a player has a subgame ϕ*-maxmin strategy, then he has a subgame maxmin strategy too. As a consequence, the existence of a subgame ϕ-maxmin strategy for every positive tolerance function ϕ is equivalent to the existence of a subgame maxmin strategy. [ABSTRACT FROM AUTHOR]
- Published
- 2021
39. Online event-triggered adaptive critic design for multi-player zero-sum games of partially unknown nonlinear systems with input constraints.
- Author
- Liu, Pengda, Zhang, Huaguang, Ren, He, and Liu, Chong
- Subjects
- MULTIPLAYER games, NONLINEAR systems, COST functions, TELECOMMUNICATION systems, ALGORITHMS
- Abstract
This paper focuses on the design of an online event-triggered optimal control strategy for multi-player zero-sum games (MP-ZSGs) with control constraints when the system model is partially unknown. Non-quadratic functions are utilized to construct the cost functions under the condition of control constraints. The proposed algorithm is designed based on the framework of identifier-critic networks. The unknown drift dynamics model is reconstructed by an identifier neural network (INN) using the input and output data. The near-optimal event-based controls and time-based disturbances are designed by training a critic neural network (CNN). With the aid of the designed event-triggered mechanism (ETM), needless computation and communication of the system signals are reduced so as to save computing and communication resources. Meanwhile, to remove the persistence of excitation (PE) condition, the historical and current data are utilized to construct a modified tuning law for the CNN. Theoretically, the uniform ultimate boundedness (UUB) properties of the system states and the critic weight errors are proved by a Lyapunov approach. Moreover, Zeno behavior is proved to be excluded under the designed triggering condition. Finally, the convergence and performance of the online method are verified by simulating a representative example. [ABSTRACT FROM AUTHOR]
- Published
- 2021
40. A Mean Field Approach for Discounted Zero-Sum Games in a Class of Systems of Interacting Objects.
- Author
- Higuera-Chan, Carmen G. and Minjárez-Sosa, J. Adolfo
- Abstract
The paper deals with systems composed of a large number N of interacting objects (e.g., agents, particles) controlled by two players defining a stochastic zero-sum game. The objects can be classified according to a finite set of classes or categories over which they move randomly. Because N is too large, the game problem is studied following a mean field approach. That is, a zero-sum game model GM_N, where the states are the proportions of objects in each class, is introduced. Then, letting N → ∞ (the mean field limit), we obtain a new game model GM, independent of N, which is easier to analyze than GM_N. Considering a discounted optimality criterion, our objective is to prove that an optimal pair of strategies in GM is an approximately optimal pair as N → ∞ in the original game model GM_N. [ABSTRACT FROM AUTHOR]
- Published
- 2021
41. Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate.
- Author
- Wang, Yuan, Wang, Ding, Zhao, Mingming, Liu, Nan, and Qiao, Junfei
- Subjects
- ZERO sum games, DYNAMIC programming
- Abstract
In this paper, an adjustable Q-learning scheme is developed to solve the discrete-time nonlinear zero-sum game problem, which can accelerate the convergence rate of the iterative Q-function sequence. First, the monotonicity and convergence of the iterative Q-function sequence are analyzed under some conditions. Moreover, by employing neural networks, the model-free tracking control problem can be addressed for zero-sum games. Second, two practical algorithms are designed to guarantee convergence with accelerated learning. In one algorithm, an adjustable acceleration phase is added to the iteration process of Q-learning, which can be adaptively terminated with a convergence guarantee. In the other algorithm, a novel acceleration function is developed, which can adjust the relaxation factor to ensure convergence. Finally, through a simulation example with a practical physical background, the excellent performance of the developed algorithm is demonstrated with neural networks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
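Editor's note: one classical instance of the relaxation-factor idea in entry 41 is an over-relaxed value/Q iteration, Q ← (1-ω)Q + ωT(Q). The sketch below compares sweep counts for two ω values on a random finite zero-sum game (pure-maxmin stage solve, purely illustrative); modest ω slightly above 1 tends to speed up convergence on such instances, while ω too large can break the contraction. This is not the authors' acceleration function, just the generic mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
nS, nA, nO, gamma = 5, 3, 3, 0.9
R = rng.uniform(size=(nS, nA, nO))
P = rng.dirichlet(np.ones(nS), size=(nS, nA, nO))

def T(Q):
    V = Q.min(axis=2).max(axis=1)      # pure maxmin stage value (simplified)
    return R + gamma * (P @ V)         # Bellman backup of the Q-function

for omega in (1.0, 1.2):
    Q = np.zeros((nS, nA, nO))
    for k in range(1, 2001):
        Q_new = (1.0 - omega) * Q + omega * T(Q)   # relaxed update
        if np.max(np.abs(Q_new - Q)) < 1e-10:
            break
        Q = Q_new
    print(f"omega={omega}: converged in {k} sweeps")
```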
42. Zero-Sum Differential Games
- Author
- Cardaliaguet, Pierre and Rainer, Catherine
- Published
- 2018
43. Introduction to the Theory of Games
- Author
- Başar, Tamer
- Published
- 2018
44. Game Theory
- Author
- Aumann, R. J.
- Published
- 2018
45. Zero-Sum Games
- Author
- Bacharach, Michael
- Published
- 2018
46. Optimality of Two-Parameter Strategies in Stochastic Control
- Author
- Yamazaki, Kazutoshi
- Published
- 2018
47. Comparison of Information Structures for Zero-Sum Games and a Partial Converse to Blackwell Ordering in Standard Borel Spaces.
- Author
- Hogeboom-Burr, Ian and Yüksel, Serdar
- Subjects
- STATISTICAL decision making, DECISION theory, COST functions, GAMES, STOCHASTIC orders
- Abstract
In statistical decision theory involving a single decision maker, an information structure is said to be better than another one if, for any cost function involving a hidden state variable and an action variable which is restricted to be conditionally independent from the state given some measurement, the solution value under the former is not worse than that under the latter. For finite spaces, a theorem due to Blackwell leads to a complete characterization of when one information structure is better than another. For stochastic games, in general, such an ordering is not possible since additional information can lead to equilibrium perturbations with positive or negative values to a player. However, for zero-sum games in a finite probability space, Pęski introduced a complete characterization of the ordering of information structures. In this paper, we obtain an infinite-dimensional (standard Borel) generalization of Pęski's result. A corollary is that more information cannot hurt a decision maker taking part in a zero-sum game. We establish two supporting results which are essential and explicit, though modest, improvements on the prior literature: (i) a partial converse to Blackwell's ordering in the standard Borel setup and (ii) an existence result for equilibria in zero-sum games with incomplete information. [ABSTRACT FROM AUTHOR]
- Published
- 2021
48. An Angel-Daemon Approach to Assess the Uncertainty in the Power of a Collectivity to Act
- Author
- Fragnito, Giulia, Gabarro, Joaquim, and Serna, Maria
- Published
- 2017
49. A Dynkin Game on Assets with Incomplete Information on the Return.
- Author
- De Angelis, Tiziano, Gensbittel, Fabien, and Villeneuve, Stephane
- Subjects
- NASH equilibrium, RATE of return, ASSETS (Accounting), GAMES, MARKOV processes
- Abstract
This paper studies a two-player zero-sum Dynkin game arising from pricing an option on an asset whose rate of return is unknown to both players. Using filtering techniques, we first reduce the problem to a zero-sum Dynkin game on a bidimensional diffusion (X,Y). Then we characterize the existence of a Nash equilibrium in pure strategies in which each player stops at the hitting time of (X,Y) to a set with a moving boundary. A detailed description of the stopping sets for the two players is provided along with global C^1 regularity of the value function. [ABSTRACT FROM AUTHOR]
- Published
- 2021
50. Adaptive Dynamic Programming for H∞ Control of Continuous-Time Unknown Nonlinear Systems via Generalized Fuzzy Hyperbolic Models.
- Author
- Su, Hanguang, Zhang, Huaguang, Gao, David Wenzhong, and Luo, Yanhong
- Subjects
- NONLINEAR systems, CLOSED loop systems, DYNAMIC programming, SYSTEM dynamics, ALGORITHMS
- Abstract
In this paper, a novel adaptive dynamic programming (ADP) algorithm is developed for the infinite-horizon H∞ optimal control problem for unknown continuous-time (CT) nonlinear systems subject to external disturbances. To facilitate the implementation of the algorithm, generalized fuzzy hyperbolic models (GFHMs) are utilized to establish an identifier-critic architecture, where the identifier is designed to reconstruct the unknown system dynamics and the GFHM-based critic network is employed to approximate the value functions. The CT H∞ optimal control problem is converted into a two-player zero-sum game and the corresponding Hamilton–Jacobi–Isaacs equation is derived. The learning procedure of the critic design is adaptively implemented with the help of the reconstructed model, so the requirement of complete knowledge of the system dynamics is relaxed. Furthermore, by means of the Lyapunov direct method, the uniform ultimate boundedness stability analysis of the closed-loop control system is explicitly provided. Finally, to compare the control performance and disturbance attenuation properties of the proposed method and existing ADP algorithms, two numerical examples are given. [ABSTRACT FROM AUTHOR]
- Published
- 2020