Symphony of experts: orchestration with adversarial insights in reinforcement learning

Authors :
Jonckheere, Matthieu
Mignacco, Chiara
Stoltz, Gilles
Publication Year :
2023

Abstract

Structured reinforcement learning leverages policies with advantageous properties to reach better performance, particularly in scenarios where exploration poses challenges. We explore this field through the concept of orchestration, where a (small) set of expert policies guides decision-making; the modeling thereof constitutes our first contribution. We then establish value-function regret bounds for orchestration in the tabular setting by transferring regret-bound results from adversarial settings. We generalize and extend the analysis of natural policy gradient in Agarwal et al. [2021, Section 5.3] to arbitrary adversarial aggregation strategies. We also extend it to the case of estimated advantage functions, providing insights into sample complexity both in expectation and with high probability. A key point of our approach lies in its arguably more transparent proofs compared to existing methods. Finally, we present simulations for a stochastic matching toy model.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2310.16473
Document Type :
Working Paper