Seeking Nash Equilibrium for Linear Discrete-time Systems via Off-policy Q-learning.
- Source :
- IAENG International Journal of Applied Mathematics. Nov 2024, Vol. 54 Issue 11, p2477-2483. 7p.
- Publication Year :
- 2024
Abstract
- This paper considers a non-zero-sum game for linear discrete-time systems involving two players. Based on a quadratic value function, we derive coupled algebraic Riccati equations. Then, we propose both on-policy and off-policy Q-learning algorithms, which operate without prior knowledge of the system dynamics, to achieve Nash equilibrium. These algorithms necessitate the inclusion of probing noise to ensure the persistence of excitation. We show that the on-policy Q-learning algorithm may introduce bias to the Nash equilibrium due to the probing noise, while the off-policy Q-learning algorithm maintains an unbiased property. Finally, we offer a numerical example to validate the effectiveness of the presented on-policy and off-policy Q-learning algorithms. [ABSTRACT FROM AUTHOR]
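For orientation, the setting summarized in the abstract can be sketched in standard linear-quadratic game notation. The symbols below (A, B_i, Q_i, R_{ij}, K_i, P_i) are assumed here for illustration and are not taken from the paper, whose own notation and exact cost structure may differ.

```latex
% Assumed two-player linear discrete-time dynamics and quadratic costs
x_{k+1} = A x_k + B_1 u_{1,k} + B_2 u_{2,k}, \qquad
J_i = \sum_{k=0}^{\infty} \left( x_k^{\top} Q_i x_k
      + u_{1,k}^{\top} R_{i1} u_{1,k}
      + u_{2,k}^{\top} R_{i2} u_{2,k} \right), \quad i = 1, 2.

% Linear feedback policies and quadratic value functions (assumed form)
u_{i,k} = -K_i x_k, \qquad V_i(x_k) = x_k^{\top} P_i x_k, \qquad
A_c = A - B_1 K_1 - B_2 K_2.

% Coupled equations: each P_i depends on both players' gains
P_i = Q_i + K_1^{\top} R_{i1} K_1 + K_2^{\top} R_{i2} K_2
      + A_c^{\top} P_i A_c, \quad i = 1, 2,

% Best-response gains, obtained by minimizing each player's Bellman equation
K_1 = \left( R_{11} + B_1^{\top} P_1 B_1 \right)^{-1} B_1^{\top} P_1 \left( A - B_2 K_2 \right), \qquad
K_2 = \left( R_{22} + B_2^{\top} P_2 B_2 \right)^{-1} B_2^{\top} P_2 \left( A - B_1 K_1 \right).
```

Under these assumptions, a pair (K_1, K_2) that satisfies all four relations simultaneously is a Nash equilibrium in linear feedback strategies. The equations are coupled because each player's gain enters the other player's closed-loop matrix A_c.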
Details
- Language :
- English
- ISSN :
- 19929978
- Volume :
- 54
- Issue :
- 11
- Database :
- Academic Search Index
- Journal :
- IAENG International Journal of Applied Mathematics
- Publication Type :
- Academic Journal
- Accession number :
- 181474327