Back to Search Start Over

An Impossibility Result in Automata-Theoretic Reinforcement Learning

Authors :
Hahn, Ernst Moritz
Perez, Mateo
Schewe, Sven
Somenzi, Fabio
Trivedi, Ashutosh
Wojtczak, Dominik
Bouajjani, Ahmed
Holík, Lukás
Wu, Zhilin
Formal Methods and Tools
Source :
Automated Technology for Verification and Analysis ISBN: 9783031199912, Automated Technology for Verification and Analysis, 13505, 42-57
Publication Year :
2022
Publisher :
Springer International Publishing, 2022.

Abstract

The expanding role of reinforcement learning (RL) in safety-critical system design has promoted ω -automata as a way to express learning requirements—often non-Markovian—with greater ease of expression and interpretation than scalar reward signals. When ω -automata were first proposed in model-free RL, deterministic Rabin acceptance conditions were used in an attempt to provide a direct translation from ω -automata to finite state “reward” machines defined over the same automaton structure (a memoryless reward translation). While these initial attempts to provide faithful, memoryless reward translations for Rabin acceptance conditions remained unsuccessful, translations were discovered for other acceptance conditions such as suitable, limit-deterministic Büchi acceptance or more generally, good-for-MDP Büchi acceptance conditions. Yet, the question “whether a memoryless translation of Rabin conditions to scalar rewards exists” remained unresolved. This paper presents an impossibility result implying that any attempt to use Rabin automata directly (without extra memory) for model-free RL is bound to fail. To establish this result, we show a link between a class of automata enabling memoryless reward translation to closure properties of its accepting and rejecting infinity sets, and to the insight that both the property and its complement need to allow for positional strategies for such an approach to work. We believe that such impossibility results will provide foundations for the application of RL to safety-critical systems.

Subjects

Subjects :
2023 OA procedure

Details

ISBN :
978-3-031-19991-2
ISBNs :
9783031199912
Database :
OpenAIRE
Journal :
Automated Technology for Verification and Analysis ISBN: 9783031199912, Automated Technology for Verification and Analysis, 13505, 42-57
Accession number :
edsair.doi.dedup.....f2c91899261efe9a84741aff8b182328
Full Text :
https://doi.org/10.1007/978-3-031-19992-9_3