Integrating Risk-Averse and Constrained Reinforcement Learning for Robust Decision-Making in High-Stakes Scenarios

Authors :
Moiz Ahmad
Muhammad Babar Ramzan
Muhammad Omair
Muhammad Salman Habib
Source :
Mathematics, Vol 12, Iss 13, p 1954 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

This paper considers a risk-averse Markov decision process (MDP) with non-risk constraints as a dynamic optimization framework for ensuring robustness against unfavorable outcomes in high-stakes sequential decision-making, such as disaster response. In this setting, strong duality is proved without any assumptions on the problem’s convexity. This is necessary for real-world problems, e.g., deprivation costs in disaster relief, where convexity cannot be ensured. Our theoretical results imply that the problem can be solved exactly in the dual domain, where it becomes convex. Based on these duality results, an augmented-Lagrangian constraint-handling mechanism is developed for risk-averse reinforcement learning algorithms, and the mechanism is proved to converge. Finally, we empirically establish the mechanism’s convergence on a multi-stage disaster-response relief-allocation problem, using a fixed negative-reward scheme as a benchmark.
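The constrained setting the abstract describes is commonly written as maximizing a risk-averse return J(pi) subject to an expected-cost constraint C(pi) <= d, with the dual problem minimizing over multipliers lambda >= 0 the maximum of the Lagrangian over policies. As a concrete illustration of an augmented-Lagrangian constraint-handling loop of this kind, the following minimal Python sketch maximizes a toy parametric objective under one inequality constraint. All names (policy_return, constraint_cost, budget) and the finite-difference inner solver are illustrative assumptions, not the paper’s implementation.

import numpy as np

# Minimal augmented-Lagrangian sketch for "maximize J(theta) s.t. C(theta) <= budget".
# Everything here is a toy stand-in: an RL method would replace the finite-difference
# inner loop with (risk-averse) policy-gradient updates.

rng = np.random.default_rng(0)

def policy_return(theta):
    # Stand-in for the risk-averse objective J(pi_theta); peak at theta = 1.
    return -np.sum((theta - 1.0) ** 2)

def constraint_cost(theta):
    # Stand-in for the expected constraint cost C(pi_theta).
    return np.sum(theta ** 2)

budget = 1.0          # constraint level d in C(pi) <= d
theta = rng.normal(size=3)
lam, mu = 0.0, 1.0    # Lagrange multiplier and penalty coefficient
lr, eps = 0.05, 1e-5

def aug_lagrangian(th):
    # Augmented Lagrangian for one inequality constraint (constant terms in
    # lam are dropped; they do not affect the gradient with respect to theta).
    g = constraint_cost(th) - budget
    return policy_return(th) - (mu / 2.0) * max(0.0, g + lam / mu) ** 2

for outer in range(50):
    # Primal step: approximately maximize the augmented Lagrangian in theta.
    for _ in range(100):
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            e = np.zeros_like(theta)
            e[i] = eps
            grad[i] = (aug_lagrangian(theta + e) - aug_lagrangian(theta - e)) / (2 * eps)
        theta = theta + lr * grad
    # Dual step: the multiplier rises while the constraint is violated.
    lam = max(0.0, lam + mu * (constraint_cost(theta) - budget))

print("theta =", theta)                      # expected near 1/sqrt(3) per coordinate
print("C(theta) =", constraint_cost(theta))  # expected near the budget of 1.0
print("lambda =", lam)                       # expected near sqrt(3) - 1

The dual update lam <- max(0, lam + mu * (C - d)) is the step whose convergence the paper analyzes in its own setting; the toy problem above is deliberately nonconvex-agnostic in structure, echoing the abstract’s point that the dual domain is where convexity is recovered.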

Details

Language :
English
ISSN :
2227-7390
Volume :
12
Issue :
13
Database :
Directory of Open Access Journals
Journal :
Mathematics
Publication Type :
Academic Journal
Accession number :
edsdoj.904ac86944514714b911b8dff343628f
Document Type :
article
Full Text :
https://doi.org/10.3390/math12131954