Convergence of Recurrent Neuro-Fuzzy Value-Gradient Learning With and Without an Actor
- Author
- Seaar Al-Dabooni and Donald C. Wunsch
- Subjects
- Discrete mathematics, Neuro-fuzzy, Computer science, Applied Mathematics, Mobile robot, Function (mathematics), Optimal control, Dynamic programming, Computational Theory and Mathematics, Artificial Intelligence, Control and Systems Engineering, Adaptive system, Convergence, Affine transformation
- Abstract
- In recent years, a gradient form of $n$ -step temporal-difference [TD( $\lambda$ )] learning has been developed into an advanced adaptive dynamic programming (ADP) algorithm called value-gradient learning [VGL( $\lambda$ )]. In this paper, we improve the VGL( $\lambda$ ) architecture with a variant called the “single adaptive actor network [SNVGL( $\lambda$ )],” which uses only a single approximator function network (critic) instead of the dual networks (critic and actor) used in VGL( $\lambda$ ). SNVGL( $\lambda$ ) therefore has lower computational requirements than VGL( $\lambda$ ). Moreover, a recurrent hybrid neuro-fuzzy (RNF) network and a first-order Takagi–Sugeno RNF (TSRNF) network are derived and implemented to build the critic and actor networks. Furthermore, we develop theoretical convergence proofs for both VGL( $\lambda$ ) and SNVGL( $\lambda$ ) under certain conditions. A model-based mobile-robot simulation is used to solve the optimal control problem for affine nonlinear discrete-time systems. The mobile robot is exposed to various noise levels to verify performance and to validate the theoretical analysis.
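The value-gradient idea the abstract summarizes can be illustrated with a toy sketch: a linear critic learns the gradient of the cost-to-go for a one-dimensional affine system under a fixed linear policy, using $\lambda$-blended bootstrapped targets computed by a backward recursion. All constants and names here (`a`, `b`, `k`, `gamma`, `lam`) are illustrative assumptions for a minimal example, not the paper's formulation or its neuro-fuzzy approximators.

```python
# Toy value-gradient learning sketch (illustrative only, not the paper's SNVGL(lambda)).
# System: x' = a*x + b*u (affine), fixed policy u = -k*x, cost U = x^2 + u^2.
# Critic G(x) = w*x approximates dV/dx; targets blend bootstrapped and recursed
# gradients with weight lam, in the spirit of VGL(lambda).

a, b = 0.9, 0.5           # affine dynamics x_{t+1} = a*x_t + b*u_t
k = 0.4                   # fixed linear policy u = -k*x
gamma, lam = 0.95, 0.7    # discount factor and lambda
alpha = 0.05              # critic learning rate
w = 0.0                   # linear critic weight: G(x) = w*x

for episode in range(200):
    # Roll out a short trajectory from x0 = 1.
    xs = [1.0]
    for t in range(20):
        u = -k * xs[-1]
        xs.append(a * xs[-1] + b * u)

    # Backward recursion for lambda-blended value-gradient targets:
    # G'_t = dU/dx + gamma * (dx_{t+1}/dx_t) * (lam*G'_{t+1} + (1-lam)*G(x_{t+1}))
    f = a - b * k                  # total derivative dx_{t+1}/dx_t under the policy
    G_next = w * xs[-1]            # bootstrap the gradient at the horizon
    targets = []
    for t in reversed(range(20)):
        dU = 2.0 * (1.0 + k * k) * xs[t]   # d/dx of U = (1 + k^2) * x^2
        G_t = dU + gamma * f * (lam * G_next + (1.0 - lam) * w * xs[t + 1])
        targets.append((xs[t], G_t))
        G_next = G_t

    # Gradient-descent update of the critic toward the targets.
    for x, G_t in targets:
        w += alpha * (G_t - w * x) * x

print(round(w, 3))
```

For this linear-quadratic setup the critic weight should approach the analytic fixed point $2(1+k^2)/(1-\gamma f^2)$ with $f = a - bk$, independent of $\lambda$; $\lambda$ only changes how the recursed and bootstrapped gradients are mixed along the way.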
- Published
- 2020