1. A Q-learning algorithm for Markov decision processes with continuous state spaces.
- Author
- Hu, Jiaqiao; Yang, Xiangyu; Hu, Jian-Qiang; Peng, Yijie
- Subjects
- *MARKOV processes, *CONTINUOUS processing, *ASYNCHRONOUS learning, *OPTIMIZATION algorithms, *ALGORITHMS, *STOCHASTIC control theory
- Abstract
- We propose an online algorithm for solving a class of continuous-state Markov decision processes. The algorithm combines classical Q-learning with an asynchronous averaging procedure, which allows Q-function estimates at sampled state–action pairs to be adaptively updated based on observations collected along a single sample trajectory. These estimates are then used to iteratively construct an interpolation-based function approximator of the Q-function. We prove the convergence of the algorithm and provide numerical results to illustrate its performance.
  • Proposed a model-free Q-learning algorithm for solving a class of infinite horizon Markov decision processes with continuous state spaces.
  • Proved the strong convergence of the algorithm when used in conjunction with a class of non-linear function approximators.
  • Illustrated the performance of the proposed algorithm through simulation comparison studies.
  [ABSTRACT FROM AUTHOR]
- Published
- 2024
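The general idea summarized in the abstract (Q-value estimates kept at sampled state–action pairs, updated by running averages along one trajectory, and extended to the whole state space by interpolation) can be illustrated with a minimal sketch. This is not the authors' algorithm: the one-dimensional toy environment, the fixed grid of sample states, the nearest-node update rule, the `1/visits` step size, the uniformly random behavior policy, and all names (`step`, `interp_q`, `q_learning_interp`) are assumptions chosen only for illustration.

```python
import numpy as np


def interp_q(q, grid, s, a):
    """Piecewise-linear interpolation of the node estimates for action a."""
    return float(np.interp(s, grid, q[:, a]))


def step(s, a, rng):
    """Hypothetical toy dynamics on [0, 1]: action 1 drifts right, action 0 left.

    The reward is the new position, so moving right is the better action.
    """
    drift = 0.1 if a == 1 else -0.1
    s_next = float(np.clip(s + drift + 0.02 * rng.standard_normal(), 0.0, 1.0))
    return s_next, s_next


def q_learning_interp(n_steps=20000, n_grid=11, gamma=0.5, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_grid)        # assumed fixed sample states
    q = np.zeros((n_grid, 2))                   # Q estimates at (node, action)
    visits = np.zeros((n_grid, 2), dtype=int)   # per-pair update counts
    s = float(rng.uniform())

    for _ in range(n_steps):
        # Off-policy exploration: uniformly random actions on one trajectory.
        a = int(rng.integers(2))
        s_next, r = step(s, a, rng)

        # Bootstrapped target uses the interpolated Q-function at s_next.
        target = r + gamma * max(interp_q(q, grid, s_next, b) for b in range(2))

        # Asynchronous update: only the node nearest to the visited state
        # changes, with a sample-average step size 1 / (number of visits).
        i = int(np.argmin(np.abs(grid - s)))
        visits[i, a] += 1
        q[i, a] += (target - q[i, a]) / visits[i, a]

        s = s_next

    return grid, q


if __name__ == "__main__":
    grid, q = q_learning_interp()
    for s_node, (q_left, q_right) in zip(grid, q):
        print(f"s={s_node:.1f}  Q(left)={q_left:.3f}  Q(right)={q_right:.3f}")
```

In this toy setting the learned estimates should favor the rightward action, since it yields the larger reward. The convergence guarantees stated in the abstract apply to the authors' algorithm, not to this simplified sketch.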