
Balancing Value Iteration and Policy Iteration for Discrete-Time Control.

Authors :
Luo, Biao
Yang, Yin
Wu, Huai-Ning
Huang, Tingwen
Source :
IEEE Transactions on Systems, Man & Cybernetics. Systems. Nov2020, Vol. 50 Issue 11, p3948-3958. 11p.
Publication Year :
2020

Abstract

The optimal control problem of discrete-time nonlinear systems depends on the solution of the Bellman equation. In this paper, an adaptive reinforcement learning (RL) method is developed to solve the complex Bellman equation by balancing value iteration (VI) and policy iteration (PI). By introducing a balance parameter, the adaptive RL method integrates VI and PI, which accelerates VI and avoids the need for an initial admissible control. Convergence of the adaptive RL method is proved by showing that it converges to the solution of the Bellman equation. The method is then realized with a neural network (NN) approximation of the value function, and a least-squares scheme is developed for updating the NN weights. Convergence of the NN-based adaptive RL method is proved while accounting for the NN approximation error. To further improve performance, an adaptive rule is developed that tunes the balance parameter from iteration to iteration. Finally, the effectiveness of the adaptive RL method is validated in simulation studies. [ABSTRACT FROM AUTHOR]
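To make the idea of balancing VI and PI concrete, the following is a minimal illustrative sketch on a random finite MDP, not the authors' formulation (the paper treats nonlinear systems with NN value-function approximation). Here a hypothetical balance parameter `lam` in [0, 1] blends a pure VI backup with one extra PI-style evaluation sweep under the current greedy policy; `lam = 0` recovers plain value iteration. All names and the specific blending rule are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nA, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # transition kernel P[s, a, s']
R = rng.random((nS, nA))                        # reward r(s, a)

def balanced_backup(V, lam):
    """One iteration blending a VI backup with an extra PI-style
    evaluation sweep under the greedy policy, weighted by lam."""
    Q = R + gamma * P @ V            # one-step lookahead Q[s, a]
    vi = Q.max(axis=1)               # pure VI (greedy Bellman) backup
    pi = Q.argmax(axis=1)            # greedy policy (improvement step)
    # extra evaluation sweep under pi, nudging toward full policy evaluation
    pe = R[np.arange(nS), pi] + gamma * P[np.arange(nS), pi] @ vi
    return (1.0 - lam) * vi + lam * pe

def solve(lam, iters=300):
    V = np.zeros(nS)
    for _ in range(iters):
        V = balanced_backup(V, lam)
    return V

V_star = solve(lam=0.0, iters=2000)  # plain VI reference solution
V_bal = solve(lam=0.5)               # balanced iteration
assert np.allclose(V_star, V_bal, atol=1e-6)
```

Both backups are gamma-contractions with the optimal value function as their common fixed point, so any convex combination also converges to it; larger `lam` simply spends more effort per iteration on evaluating the current greedy policy, in the spirit of PI.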

Details

Language :
English
ISSN :
21682216
Volume :
50
Issue :
11
Database :
Academic Search Index
Journal :
IEEE Transactions on Systems, Man & Cybernetics. Systems
Publication Type :
Academic Journal
Accession number :
146472560
Full Text :
https://doi.org/10.1109/TSMC.2019.2898389