
Multiplayer Bandits Without Observing Collision Information.

Authors :
Lugosi, Gábor
Mehrabian, Abbas
Source :
Mathematics of Operations Research; May 2022, Vol. 47 Issue 2, p1247-1265, 19p
Publication Year :
2022

Abstract

We study multiplayer stochastic multiarmed bandit problems in which the players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider two feedback models: a model in which the players can observe whether a collision has occurred, and a more difficult setup in which no collision information is available. We give the first theoretical guarantees for the second model: an algorithm with logarithmic regret, and an algorithm with square-root regret that does not depend on the gaps between the means. For the first model, we give the first square-root regret bounds that do not depend on the gaps. Building on these ideas, we also give an algorithm for reaching approximate Nash equilibria quickly in stochastic anticoordination games. [ABSTRACT FROM AUTHOR]
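The two feedback models described in the abstract can be illustrated with a minimal simulation of one round of the game. This is a hedged sketch, not the authors' algorithm: the function name `pull_round`, the Bernoulli reward distribution, and the tuple-based feedback encoding are illustrative assumptions; the paper itself specifies only that colliding players receive zero reward and, in the harder model, no collision indicator.

```python
import random

def pull_round(means, choices, observe_collisions, rng=random):
    """Simulate one round of the multiplayer bandit game.

    means: per-arm Bernoulli reward probabilities (an assumed reward model).
    choices: the arm index each player pulls this round.
    observe_collisions: True for the first feedback model, False for the second.
    Returns (rewards, feedback), one entry per player.
    """
    # Count how many players pulled each arm, to detect collisions.
    counts = {}
    for a in choices:
        counts[a] = counts.get(a, 0) + 1
    rewards, feedback = [], []
    for a in choices:
        collided = counts[a] > 1
        # Colliding players receive zero reward; others draw a Bernoulli reward.
        r = 0.0 if collided else (1.0 if rng.random() < means[a] else 0.0)
        rewards.append(r)
        if observe_collisions:
            # First model: the player sees the collision indicator alongside the reward.
            feedback.append((r, collided))
        else:
            # Second (harder) model: only the reward is seen, so a zero reward
            # is ambiguous between a collision and a genuine zero draw.
            feedback.append((r,))
    return rewards, feedback
```

The ambiguity in the second model is visible directly: a player observing reward 0.0 cannot tell whether another player pulled the same arm or the arm simply paid nothing, which is what makes that setting harder to analyze.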

Details

Language :
English
ISSN :
0364-765X
Volume :
47
Issue :
2
Database :
Complementary Index
Journal :
Mathematics of Operations Research
Publication Type :
Academic Journal
Accession Number :
157053297
Full Text :
https://doi.org/10.1287/moor.2021.1168