Multiplayer Bandits Without Observing Collision Information.
- Source :
- Mathematics of Operations Research; May 2022, Vol. 47 Issue 2, p1247-1265, 19p
- Publication Year :
- 2022
Abstract
- We study multiplayer stochastic multiarmed bandit problems in which the players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider two feedback models: a model in which the players can observe whether a collision has occurred and a more difficult setup in which no collision information is available. We give the first theoretical guarantees for the second model: an algorithm with a logarithmic regret and an algorithm with a square-root regret that does not depend on the gaps between the means. For the first model, we give the first square-root regret bounds that do not depend on the gaps. Building on these ideas, we also give an algorithm for reaching approximate Nash equilibria quickly in stochastic anticoordination games. [ABSTRACT FROM AUTHOR]
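The two feedback models in the abstract differ only in whether the `collided` flag below is visible to the players. As a minimal sketch (the function name, Bernoulli arms, and interface are illustrative assumptions, not the paper's algorithm), one round of the collision model can be simulated as:

```python
import random

def play_round(arm_means, choices, rng):
    """Simulate one round of a multiplayer bandit with collisions.

    arm_means: Bernoulli success probability of each arm (assumed
        reward model for illustration).
    choices: arm index chosen by each player this round.
    Returns (rewards, collided). A player whose arm was pulled by
    anyone else gets zero reward; the `collided` flags are observable
    only in the first feedback model, and hidden in the second.
    """
    counts = {}
    for a in choices:
        counts[a] = counts.get(a, 0) + 1
    rewards, collided = [], []
    for a in choices:
        hit = counts[a] > 1
        collided.append(hit)
        # Zero reward on collision; otherwise a Bernoulli draw.
        rewards.append(0 if hit else int(rng.random() < arm_means[a]))
    return rewards, collided
```

In the no-collision-information setting, a player receiving reward 0 cannot tell whether the arm's draw was 0 or a collision occurred, which is what makes that model harder.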
- Subjects :
- MULTI-armed bandit problem (Probability theory)
- NASH equilibrium
Details
- Language :
- English
- ISSN :
- 0364-765X
- Volume :
- 47
- Issue :
- 2
- Database :
- Complementary Index
- Journal :
- Mathematics of Operations Research
- Publication Type :
- Academic Journal
- Accession number :
- 157053297
- Full Text :
- https://doi.org/10.1287/moor.2021.1168