Deep Reinforcement Learning with Temporal Logics

Authors :
Alessandro Abate
Daniel Kroening
Mohammadhosein Hasanbeig
Source :
Lecture Notes in Computer Science ISBN: 9783030576271, FORMATS
Publication Year :
2020
Publisher :
Springer International Publishing, 2020.

Abstract

The combination of data-driven learning methods with formal reasoning has seen a surge of interest, as either area has the potential to bolster the other. For instance, formal methods promise to expand the use of state-of-the-art learning approaches in the direction of certification and sample efficiency. In this work, we propose a deep Reinforcement Learning (RL) method for policy synthesis in continuous-state/action unknown environments, under requirements expressed in Linear Temporal Logic (LTL). We show that this combination lifts the applicability of deep RL to complex temporal and memory-dependent policy synthesis goals. We express an LTL specification as a Limit Deterministic Büchi Automaton (LDBA) and synchronise it on-the-fly with the agent/environment. The LDBA in practice monitors the environment, acting as a modular reward machine for the agent: accordingly, a modular Deep Deterministic Policy Gradient (DDPG) architecture is proposed to generate a low-level control policy that maximises the probability of satisfying the given LTL formula. We evaluate our framework in a cart-pole example and in a Mars rover experiment, where we achieve near-perfect success rates, while baselines based on standard RL are shown to fail in practice.
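
The on-the-fly synchronisation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a hand-written three-state deterministic automaton as a stand-in for an LDBA, a hypothetical one-dimensional environment, and no learning (no DDPG). The property "eventually reach goal and never visit unsafe", the labelling thresholds, the reward value, and all names (`label`, `automaton_step`, `product_step`, `run_episode`) are assumptions made for illustration.

```python
# Toy automaton for "eventually goal and never unsafe" (assumed example property).
# States: 0 = waiting, 1 = goal reached (accepting), 2 = violated (rejecting sink).
TRANSITIONS = {
    (0, "goal"): 1,
    (0, "unsafe"): 2,
    (1, "unsafe"): 2,
}
ACCEPTING = {1}


def label(position):
    """Hypothetical labelling function: map an environment state to a label."""
    if position >= 3:
        return "goal"
    if position <= -2:
        return "unsafe"
    return "none"


def automaton_step(q, lab):
    """Advance the automaton; pairs not listed in TRANSITIONS keep the state."""
    return TRANSITIONS.get((q, lab), q)


def product_step(position, q, action):
    """One step of the synchronised agent/environment/automaton product."""
    new_position = position + action            # toy environment dynamics
    q_next = automaton_step(q, label(new_position))
    # Reward only on entering the accepting state (an assumed reward shape).
    reward = 1.0 if (q_next in ACCEPTING and q not in ACCEPTING) else 0.0
    return new_position, q_next, reward


def run_episode(actions, position=0, q=0):
    """Replay a fixed action sequence and accumulate the automaton reward."""
    total = 0.0
    for action in actions:
        position, q, reward = product_step(position, q, action)
        total += reward
    return position, q, total
```

In this sketch the automaton state acts as the memory that the raw environment state lacks: an agent that observes the pair (position, automaton state) can tell whether the goal has already been reached or the property has already been violated, which is the role the abstract assigns to the LDBA as a reward machine.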

Details

ISBN :
978-3-030-57627-1
ISBNs :
9783030576271
Database :
OpenAIRE
Journal :
Lecture Notes in Computer Science ISBN: 9783030576271, FORMATS
Accession number :
edsair.doi...........9712ff7f273f25194cbb0fe5b2fce5ad
Full Text :
https://doi.org/10.1007/978-3-030-57628-8_1