Decentralized adaptive temporal-difference learning over time-varying networks and its finite-time analysis
- Source :
- Neurocomputing. Nov 2024, Vol. 604.
- Publication Year :
- 2024
Abstract
- In reinforcement learning, centralized temporal-difference (TD) learning is commonly used to solve the policy evaluation problem. However, decentralized adaptive variants of TD learning have rarely been investigated in multi-agent reinforcement learning. To fill this gap, we propose a decentralized adaptive TD learning algorithm over time-varying networks, based on linear function approximation, referred to as D-AdaTD. We rigorously analyze the convergence of D-AdaTD, establishing an explicit finite-time analysis under different step-sizes. Specifically, under constant step-sizes, the average estimated value function converges to a neighborhood of the optimal value at rate O(1/(k+1)), and each agent's estimated parameter converges to a neighborhood of the optimal parameter at rate O(ξ^k), where k is the number of iterations and ξ ∈ (0, 1). Under diminishing step-sizes, the average estimated value function converges to the optimal value at rate O((1 + log(k+1))/k), and the average estimated parameter converges to the optimal parameter at rate O((1 + log(k+1))/(k+1)). In addition, we evaluate the performance of D-AdaTD via simulation experiments, which are often lacking in existing work on decentralized TD learning. The experimental results further validate the effectiveness of D-AdaTD.
- Subjects :
- *MACHINE learning
*TIME-varying networks
*NEIGHBORHOODS
Details
- Language :
- English
- ISSN :
- 09252312
- Volume :
- 604
- Database :
- Academic Search Index
- Journal :
- Neurocomputing
- Publication Type :
- Academic Journal
- Accession number :
- 179364708
- Full Text :
- https://doi.org/10.1016/j.neucom.2024.128311