Optimistic sequential multi-agent reinforcement learning with motivational communication.

Authors :: Huang A
Wang Y
Zhou X
Zou H
Dong X
Che X
Source :: Neural networks : the official journal of the International Neural Network Society [Neural Netw] 2024 Nov; Vol. 179, pp. 106547. Date of Electronic Publication: 2024 Jul 22.
Publication Year :: 2024
Abstract: Centralized Training with Decentralized Execution (CTDE) is a prevalent paradigm in the field of fully cooperative Multi-Agent Reinforcement Learning (MARL). Existing algorithms often encounter two major problems: independent strategies tend to underestimate the potential value of actions, leading to the convergence on sub-optimal Nash Equilibria (NE); some communication paradigms introduce added complexity to the learning process, complicating the focus on the essential elements of the messages. To address these challenges, we propose a novel method called Optimistic Sequential Soft Actor Critic with Motivational Communication (OSSMC). The key idea of OSSMC is to utilize a greedy-driven approach to explore the potential value of individual policies, named optimistic Q-values, which serve as an upper bound for the Q-value of the current policy. We then integrate a sequential update mechanism with optimistic Q-value for agents, aiming to ensure monotonic improvement in the joint policy optimization process. Moreover, we establish motivational communication modules for each agent to disseminate motivational messages to promote cooperative behaviors. Finally, we employ a value regularization strategy from the Soft Actor Critic (SAC) method to maximize entropy and improve exploration capabilities. The performance of OSSMC was rigorously evaluated against a series of challenging benchmark sets. Empirical results demonstrate that OSSMC not only surpasses current baseline algorithms but also exhibits a more rapid convergence rate.
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2024 Elsevier Ltd. All rights reserved.)

Subjects :: Communication
Humans
Neural Networks, Computer
Cooperative Behavior
Reinforcement, Psychology
Motivation
Algorithms

Details

Language :: English
ISSN :: 1879-2782
Volume :: 179
Database :: MEDLINE
Journal :: Neural networks : the official journal of the International Neural Network Society
Publication Type :: Academic Journal
Accession number :: 39068677
Full Text :: https://doi.org/10.1016/j.neunet.2024.106547
