1. Stochastic Bandits With Non-Stationary Rewards: Reward Attack and Defense
- Author
Yang, Chenye, Liu, Guanlin, and Lai, Lifeng
- Abstract
In this paper, we investigate reward attacks on stochastic multi-armed bandit algorithms in non-stationary environments. The attacker's goal is to force the victim algorithm to choose a suboptimal arm most of the time while incurring a small attack cost. We consider three increasingly general attack scenarios, each with different assumptions about the environment, the victim algorithm, and the information available to the attacker. We propose three attack strategies, one for each scenario, and prove that they are successful in terms of expected target arm selection and attack cost. We also propose a defense algorithm for non-stationary environments that can defend against any attacker whose attack cost is bounded by a budget, and prove that it is robust to attacks. Simulation results validate our theoretical analysis.
- Published
- 2024