
Simple Uncoupled No-regret Learning Dynamics for Extensive-form Correlated Equilibrium.

Authors :: FARINA, GABRIELE
CELLI, ANDREA
MARCHESI, ALBERTO
GATTI, NICOLA
Source :: Journal of the ACM; Nov 2022, Vol. 69 Issue 6, p1-41, 41p
Publication Year :: 2022
Abstract: The existence of simple uncoupled no-regret learning dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as imperfect information. Because of the sequential nature and presence of private information in the game, correlation in extensive-form games possesses significantly different properties than in normal-form games, many of which are still open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to the classical notion of correlated equilibrium in normal-form games. Compared to the latter, the constraints that define the set of EFCEs are significantly more complex, as the correlation device (a.k.a. mediator) must take into account the evolution of beliefs of each player as they make observations throughout the game. Due to that significant added complexity, the existence of uncoupled learning dynamics leading to an EFCE has remained a challenging open research question for a long time. In this article, we settle that question by giving the first uncoupled no-regret dynamics that converge to the set of EFCEs in n-player general-sum extensive-form games with perfect recall. We show that each iterate can be computed in time polynomial in the size of the game tree, and that, when all players play repeatedly according to our learning dynamics, the empirical frequency of play after T game repetitions is proven to be a O(1/√T)-approximate EFCE with high probability, and an EFCE almost surely in the limit. [ABSTRACT FROM AUTHOR]

Subjects :: POLYNOMIAL time algorithms
NORMAL forms (Mathematics)
EQUILIBRIUM
TREE size
STOCHASTIC learning models
REINFORCEMENT learning

Details

Language :: English
ISSN :: 00045411
Volume :: 69
Issue :: 6
Database :: Complementary Index
Journal :: Journal of the ACM
Publication Type :: Academic Journal
Accession number :: 160512964
Full Text :: https://doi.org/10.1145/3563772
