Author: "Abu-Khalaf, M" / Publisher: institute of electrical and electronics engineers, inc - Searchworks@Jio Institute Digital Library Search Results



Author: Al-Tamimi A, Lewis FL, and Abu-Khalaf M
Subjects: Computer Simulation; Feedback; Algorithms; Models, Theoretical; Programming, Linear; Systems Theory
Abstract: Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven for general nonlinear systems. That is, it is shown that HDP converges to the optimal control and to the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be solved exactly. Two standard neural networks (NNs) are used: a critic NN approximates the value function, whereas an action NN approximates the optimal control policy. It is stressed that this approach allows HDP to be implemented without knowing the internal dynamics of the system. The exact-solution assumption holds for some classes of nonlinear systems and, in particular, for the DT linear quadratic regulator (LQR), where the action is linear and the value is quadratic in the states, and the NNs have zero approximation error. It is stressed that, for the LQR, HDP may be implemented without knowing the system A matrix by using two NNs. This fact is not generally appreciated in the folklore of HDP for the DT LQR, where only one critic NN is generally used.
Published: 2008
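
The value and action updates described in the abstract can be illustrated on the simplest case it mentions, the DT LQR. The following is a minimal sketch, not the authors' implementation: it assumes a scalar system x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2, solves both updates exactly in closed form instead of using neural networks, and uses the model parameters directly, so it does not reproduce the model-free aspect. The function names and the numeric values are illustrative assumptions.

```python
def hdp_scalar_lqr(a, b, q, r, iterations=200):
    """Value-iteration (HDP-style) recursion for a scalar DT LQR.

    Assumed setup (illustrative, not from the source record):
    dynamics x[k+1] = a*x[k] + b*u[k], stage cost q*x^2 + r*u^2,
    value function V_j(x) = p_j * x^2, starting from V_0 = 0.
    """
    p = 0.0
    history = [p]
    for _ in range(iterations):
        # Action update: u = -gain * x minimizes r*u^2 + p*(a*x + b*u)^2.
        gain = a * b * p / (r + b * b * p)
        # Value update: V_{j+1}(x) = q*x^2 + r*u^2 + V_j(a*x + b*u).
        closed_loop = a - b * gain
        p = q + r * gain * gain + p * closed_loop * closed_loop
        history.append(p)
    # Greedy gain for the final value estimate.
    gain = a * b * p / (r + b * b * p)
    return p, gain, history


def riccati_residual(p, a, b, q, r):
    """Residual of the scalar discrete-time algebraic Riccati equation."""
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p) - p


if __name__ == "__main__":
    a, b, q, r = 1.2, 1.0, 1.0, 1.0  # open-loop unstable example (|a| > 1)
    p, gain, history = hdp_scalar_lqr(a, b, q, r)
    print("value coefficient p:", p)
    print("feedback gain:", gain)
    print("Riccati residual:", riccati_residual(p, a, b, q, r))
```

In this sketch the sequence p_j grows from zero toward a fixed point of the Riccati equation, which is the scalar counterpart of the convergence result stated in the abstract.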

Author: Al-Tamimi A, Abu-Khalaf M, and Lewis FL
Subjects: Computer Simulation; Artificial Intelligence; Game Theory; Models, Theoretical; Signal Processing, Computer-Assisted
Abstract: In this correspondence, adaptive critic approximate dynamic programming designs are derived to solve the discrete-time zero-sum game in which the state and action spaces are continuous. This results in a forward-in-time reinforcement learning algorithm that converges to the Nash equilibrium of the corresponding zero-sum game. The results can be thought of as a way to solve the Riccati equation of the well-known discrete-time H∞ optimal control problem forward in time. Two schemes are presented: 1) heuristic dynamic programming (HDP), which solves for the value function of the game, and 2) dual heuristic dynamic programming (DHP), which solves for the costate. An H∞ autopilot design for an F-16 aircraft is presented to illustrate the results.
Published: 2007
