
Towards Optimal Attacks on Reinforcement Learning Policies

Publication Year :
2021

Abstract

Control policies trained using Deep Reinforcement Learning have recently been shown to be vulnerable to adversarial attacks that introduce even minimal perturbations to the policy input. The attacks proposed so far have been designed using heuristics based on existing adversarial example crafting techniques used to dupe classifiers in supervised learning. In contrast, this paper investigates the problem of devising optimal attacks, depending on a well-defined attacker's objective, e.g., minimizing the main agent's average reward. When the policy, the system dynamics, and the rewards are known to the attacker, a scenario referred to as a white-box attack, designing optimal attacks amounts to solving a Markov Decision Process. For what we call black-box attacks, where neither the policy nor the system is known, optimal attacks can be trained using Reinforcement Learning. We present numerical experiments demonstrating the efficiency of our attacks compared to existing attacks. We further quantify the potential impact of attacks and establish their connection to the smoothness of the policy under attack. Smooth policies are naturally less prone to attacks (e.g., policies that are Lipschitz with respect to the state are more resilient). Finally, we show that, from the main agent's perspective, the system uncertainties induced by the attack can be modelled using a Partially Observable Markov Decision Process (POMDP) framework. We demonstrate that using Reinforcement Learning methods tailored to POMDPs (e.g., using Recurrent Neural Networks) leads to more resilient policies.

Part of proceedings: ISBN 978-1-6654-4197-1, QC 20230117
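To give a concrete sense of the white-box case described in the abstract, the sketch below is an illustrative reconstruction rather than the authors' code: it assumes a small deterministic chain MDP, a fixed victim policy, and a perturbation budget of one state, and it solves the attacker's Markov Decision Process by value iteration to find the observation perturbation that minimizes the victim's discounted return.

import numpy as np

# Minimal white-box sketch (illustrative assumptions throughout, not the paper's
# code): with the victim policy, dynamics, and rewards known, the reward-minimizing
# observation perturbation is obtained by solving the attacker's MDP.

n_states = 5
gamma = 0.95
GOAL = 2

# Deterministic chain: action 0 moves left, action 1 moves right.
def step(s, a):
    return max(s - 1, 0) if a == 0 else min(s + 1, n_states - 1)

# Victim reward: +1 whenever the victim lands on the goal state.
def reward(s, a):
    return 1.0 if step(s, a) == GOAL else 0.0

# Fixed victim policy: head toward the goal (right below it, left otherwise).
def victim_policy(s_observed):
    return 1 if s_observed < GOAL else 0

# Attacker's action set: report any state within distance 1 of the true state.
def perturbations(s):
    return [s2 for s2 in range(n_states) if abs(s2 - s) <= 1]

# Value iteration on the attacker's MDP. The attacker observes the true state s,
# reports a perturbed observation s_fake, the victim acts a = pi(s_fake), and the
# environment evolves from the TRUE state s. The attacker minimizes the victim's
# discounted return.
V = np.zeros(n_states)
for _ in range(1000):
    V_new = np.array([
        min(reward(s, victim_policy(sf)) + gamma * V[step(s, victim_policy(sf))]
            for sf in perturbations(s))
        for s in range(n_states)
    ])
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

# Unattacked baseline: the victim always observes the true state.
V0 = np.zeros(n_states)
for _ in range(1000):
    V0 = np.array([reward(s, victim_policy(s)) + gamma * V0[step(s, victim_policy(s))]
                   for s in range(n_states)])

print("victim value, no attack      :", np.round(V0, 2))
print("victim value, optimal attack :", np.round(V, 2))

In the black-box setting described in the abstract, an analogous perturbation map would instead be learned with a standard Reinforcement Learning algorithm from interaction alone, without access to the dynamics or to the victim policy.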

Details

Database :
OAIster
Authors :
Russo, Alessio, Proutiere, Alexandre
Publication Type :
Electronic Resource
Accession Number :
edsoai.on1312824511
Document Type :
Electronic Resource
Full Text :
https://doi.org/10.23919/ACC50511.2021.9483025