Q-Sorting: An Algorithm for Reinforcement Learning Problems with Multiple Cumulative Constraints
- Source :
- Mathematics, Vol 12, Iss 13, p 2001 (2024)
- Publication Year :
- 2024
- Publisher :
- MDPI AG, 2024.
Abstract
- This paper proposes Q-sorting, a method and algorithm for reinforcement learning (RL) problems with multiple cumulative constraints. The primary contribution is a mechanism that dynamically determines the focus of optimization among the cumulative constraints and the objective. Actions are selected in two steps: first, actions that could break the constraints are filtered out; second, the remaining actions are sorted in descending order of the Q values of the focus. The algorithm was originally developed for the classic tabular value representation and the episodic setting of RL, but the idea can be extended to methods with function approximation and to the discounted setting. Numerical experiments are carried out on an adapted Gridworld and a motor speed synchronization problem, each with one and with two cumulative constraints. Simulation results validate the effectiveness of Q-sorting in that the cumulative constraints are honored both during and after the learning process. The advantages of Q-sorting are further shown through comparison with the method of lumped performances (LP), which accounts for constraints through weighting parameters. Q-sorting outperforms LP in both ease of use (no trial and error is needed to determine the weighting parameters) and performance consistency (a standard deviation of the cumulative performance index of 6.1920 vs. 54.2635 rad/s over 10 repeated simulation runs). It has great potential for practical engineering use.
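The two-step selection described in the abstract (filter, then sort by the Q values of the focus) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `q_sorting_select`, the dictionary layout, and the feasibility test are assumptions made for illustration: every signal is treated as "higher is better", each constraint is a lower bound on a cumulative quantity, and an action counts as potentially breaking a constraint when the amount accumulated so far plus its tabular Q estimate falls below that bound. How the focus is chosen is the paper's contribution and is not specified in the abstract, so here it is simply passed in.

```python
from typing import Dict, List, Sequence


def q_sorting_select(
    q_values: Dict[str, Sequence[float]],
    accumulated: Dict[str, float],
    thresholds: Dict[str, float],
    focus: str,
) -> List[int]:
    """Rank the actions of one state: filter by constraints, sort by focus.

    q_values[name][a] -- tabular Q estimate of signal `name` for action a
    accumulated[c]    -- amount of constraint c collected so far (assumed)
    thresholds[c]     -- lower bound on the cumulative value of c (assumed)
    focus             -- name of the signal currently being optimized
    """
    n_actions = len(q_values[focus])

    # Step 1: drop actions predicted to violate any cumulative constraint.
    feasible = [
        a
        for a in range(n_actions)
        if all(
            accumulated[c] + q_values[c][a] >= thresholds[c]
            for c in thresholds
        )
    ]

    # Step 2: sort the survivors by the focus's Q values, descending.
    # An empty list means no action passed the filter; the caller decides
    # what to do in that case (this sketch does not prescribe a fallback).
    return sorted(feasible, key=lambda a: q_values[focus][a], reverse=True)


# Hypothetical numbers: three actions, one objective, one constraint.
q = {"objective": [2.0, 3.0, 1.0], "energy": [-5.0, -9.0, -4.0]}
ranking = q_sorting_select(
    q, accumulated={"energy": -2.0}, thresholds={"energy": -8.0},
    focus="objective",
)
print(ranking)  # action 1 is filtered out; the rest are ranked by objective
```

In this sketch the first element of the returned ranking would be the action to execute; switching `focus` to a constraint name reorders the same feasible set by that constraint's Q values instead.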
Details
- Language :
- English
- ISSN :
- 2227-7390
- Volume :
- 12
- Issue :
- 13
- Database :
- Directory of Open Access Journals
- Journal :
- Mathematics
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.f0bc846f24844c81b93a1bbce13ff1bc
- Document Type :
- article
- Full Text :
- https://doi.org/10.3390/math12132001