An online hyper‐volume action bounding approach for accelerating the process of deep reinforcement learning from multiple controllers.

Authors :: Aflakian, Ali
Rastegarpanah, Alireza
Hathaway, Jamie
Stolkin, Rustam
Source :: Journal of Field Robotics; Sep2024, Vol. 41 Issue 6, p1814-1828, 15p
Publication Year :: 2024
Abstract: This paper fuses ideas from reinforcement learning (RL), Learning from Demonstration (LfD), and Ensemble Learning into a single paradigm. Knowledge from a mixture of control algorithms (experts) is used to constrain the action space of the agent, enabling faster RL refinement of a control policy by avoiding unnecessary exploratory actions. Domain‐specific knowledge of each expert is exploited. However, the resulting policy is robust against errors of individual experts, since it is refined by an RL reward function without copying any particular demonstration. Our method has the potential to supplement existing RLfD methods when multiple algorithmic approaches are available to function as experts, specifically in tasks involving continuous action spaces. We illustrate our method in the context of a visual servoing (VS) task, in which a 7‐DoF robot arm is controlled to maintain a desired pose relative to a target object. We explore four methods for bounding the actions of the RL agent during training: using a hypercube with a modified loss function, using a convex hull with a modified loss function, ignoring actions outside the convex hull, and projecting actions onto the convex hull. We compare training progress when using multiple expert demonstrators with each method, when using a single expert demonstrator with the DAgger algorithm, and when using no demonstrators. Our experiments show that using the convex hull with a modified loss function not only accelerates learning but also yields the best solution among the approaches compared. Furthermore, we demonstrate faster VS error convergence while maintaining higher manipulability of the arm, compared with classical image‐based VS, position‐based VS, and hybrid‐decoupled VS. [ABSTRACT FROM AUTHOR]
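
The hypercube idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the optional `margin` parameter, and the squared-excess penalty are all assumptions made for illustration. The abstract only says that a hypercube is used together with a modified loss function; the hard clipping shown here is a simpler stand-in, and the penalty is one plausible form such a loss term could take.

```python
import numpy as np

def hypercube_bounds(expert_actions):
    """Per-dimension [min, max] over the actions proposed by the experts."""
    a = np.asarray(expert_actions, dtype=float)  # shape: (n_experts, action_dim)
    return a.min(axis=0), a.max(axis=0)

def bound_action(agent_action, expert_actions, margin=0.0):
    """Clip the agent's action into the experts' hypercube.

    `margin` is a hypothetical slack that widens the box on every side.
    """
    low, high = hypercube_bounds(expert_actions)
    action = np.asarray(agent_action, dtype=float)
    return np.clip(action, low - margin, high + margin)

def out_of_bounds_penalty(agent_action, expert_actions):
    """Squared distance by which the action leaves the hypercube.

    An assumed example of a term that could be added to an RL loss;
    it is zero whenever the action lies inside the box.
    """
    low, high = hypercube_bounds(expert_actions)
    action = np.asarray(agent_action, dtype=float)
    excess = np.maximum(low - action, 0.0) + np.maximum(action - high, 0.0)
    return float(np.sum(excess ** 2))

# Three hypothetical experts proposing 2-D actions for the same state.
experts = [[0.0, 1.0], [0.5, -1.0], [0.2, 0.0]]
bounded = bound_action([0.9, -2.0], experts)
penalty = out_of_bounds_penalty([0.9, -2.0], experts)
```

A convex hull gives a tighter region than the hypercube, since the box also contains corner combinations that no mixture of the experts would produce; the abstract reports that the convex hull variant with a modified loss performed best.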

Subjects :: DEEP reinforcement learning
ONLINE education
MATHEMATICAL optimization
ALGORITHMS

Details

Language :: English
ISSN :: 15564959
Volume :: 41
Issue :: 6
Database :: Complementary Index
Journal :: Journal of Field Robotics
Publication Type :: Academic Journal
Accession number :: 178854641
Full Text :: https://doi.org/10.1002/rob.22355

