MDP Geometry, Normalization and Reward Balancing Solvers

Authors :
Mustafin, Arsenii
Pakharev, Aleksei
Olshevsky, Alex
Paschalidis, Ioannis Ch.
Publication Year :
2024

Abstract

The Markov Decision Process (MDP) is a widely used mathematical model for sequential decision-making problems. In this paper, we present a new geometric interpretation of MDPs with a natural normalization procedure that allows us to adjust the value function at each state without altering the advantage of any action with respect to any policy. This procedure enables the development of a novel class of algorithms for solving MDPs that find optimal policies without explicitly computing policy values. The new algorithms we propose for different settings achieve and, in some cases, improve upon state-of-the-art sample complexity results.

Comment: Preliminary version
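The invariance the abstract describes can be checked numerically on a toy problem. The sketch below is not taken from the paper. It assumes a discounted tabular MDP and uses one standard reward shift, r'(s,a) = r(s,a) + delta(s) - gamma * sum_t P(t|s,a) * delta(t), which moves every policy's value at state s by delta(s) while leaving all advantages unchanged. Whether this matches the paper's exact normalization is an assumption, and the names `evaluate_policy`, `advantages`, `shift_rewards`, and `delta` are hypothetical.

```python
import random

def evaluate_policy(P, r, pi, gamma, sweeps=2000):
    """Iterative policy evaluation: repeat V <- r_pi + gamma * P_pi V."""
    n = len(P)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [sum(pi[s][a] * (r[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n)))
                 for a in range(len(pi[s])))
             for s in range(n)]
    return V

def advantages(P, r, pi, gamma):
    """A(s, a) = Q(s, a) - V(s) for the policy pi."""
    n = len(P)
    V = evaluate_policy(P, r, pi, gamma)
    return [[r[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n)) - V[s]
             for a in range(len(r[s]))]
            for s in range(n)]

def shift_rewards(P, r, delta, gamma):
    """Assumed shift: r'(s,a) = r(s,a) + delta(s) - gamma * E[delta(next state)]."""
    n = len(P)
    return [[r[s][a] + delta[s] - gamma * sum(P[s][a][t] * delta[t] for t in range(n))
             for a in range(len(r[s]))]
            for s in range(n)]

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

# Random toy MDP: 4 states, 3 actions, discount 0.9 (all values are illustrative).
random.seed(0)
S, A, gamma = 4, 3, 0.9
P = [[normalize([random.random() for _ in range(S)]) for _ in range(A)] for _ in range(S)]
r = [[random.gauss(0, 1) for _ in range(A)] for _ in range(S)]
pi = [normalize([random.random() for _ in range(A)]) for _ in range(S)]
delta = [random.gauss(0, 1) for _ in range(S)]

r_shifted = shift_rewards(P, r, delta, gamma)
V = evaluate_policy(P, r, pi, gamma)
V_shifted = evaluate_policy(P, r_shifted, pi, gamma)
adv = advantages(P, r, pi, gamma)
adv_shifted = advantages(P, r_shifted, pi, gamma)

# Values should move by exactly delta; advantages should not move at all.
value_gap = max(abs(V_shifted[s] - V[s] - delta[s]) for s in range(S))
adv_gap = max(abs(adv_shifted[s][a] - adv[s][a]) for s in range(S) for a in range(A))
print(value_gap, adv_gap)
```

Both printed gaps should be at the level of floating-point rounding. Because the shift holds for an arbitrary policy `pi`, it illustrates why an algorithm could work with shifted rewards without tracking policy values explicitly.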

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2407.06712
Document Type :
Working Paper