Two-Dimensional Ballistic Correction Decision-Making Based on Constrained Reinforcement Learning.

Authors :: Lei, Xiaoyun
Zhang, Lin
Zhu, Lihua
Source :: Journal of Circuits, Systems & Computers. Sep2024, p1. 21p. 9 Illustrations.
Publication Year :: 2024
Abstract: To address the poor decision-making caused by aerodynamic coupling and model linearization in trajectory correction models, a reinforcement learning method is proposed that establishes a Constrained Markov Decision Process (CMDP) model linking action prediction and action constraints. Taking a classical two-dimensional trajectory correction projectile as the object of study, a cost function is built on thruster constraint violations, so that sequential prediction is transformed into a Constrained Reinforcement Learning (CRL) problem. Accordingly, a method is proposed that combines a policy gradient algorithm based on the Lagrange multiplier with a model prediction algorithm. Test results show that, under the trained optimal policy model, the corrected projectile falls within a radius of 5 m of the preset target point with a probability of over 65% (CEP < 5 m). Simulation examples verify that the trained model autonomously decides correction action sequences, yielding correction errors of 1.79, 1.1, and 0.42 m in the x, y, and z directions, respectively, which demonstrates high correction accuracy and autonomous decision-making capability. [ABSTRACT FROM AUTHOR]
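
Note on the formulation: the abstract does not give the equations, so the following is only the generic textbook form of a Lagrangian-relaxed CMDP, not necessarily the authors' exact model. All symbols are assumptions introduced here for illustration: r is the correction reward, c is the per-step cost of violating thruster constraints, d is the allowed cost budget, gamma is a discount factor, and alpha and beta are step sizes.

```latex
% Constrained objective (generic CMDP form; symbols are illustrative assumptions)
\max_{\theta} \; J_R(\pi_\theta) = \mathbb{E}_{\pi_\theta}\!\left[\sum_{t=0}^{T} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{s.t.} \quad
J_C(\pi_\theta) = \mathbb{E}_{\pi_\theta}\!\left[\sum_{t=0}^{T} \gamma^{t}\, c(s_t, a_t)\right] \le d

% Lagrangian relaxation with multiplier \lambda \ge 0
\mathcal{L}(\theta, \lambda) = J_R(\pi_\theta) - \lambda \left( J_C(\pi_\theta) - d \right)

% Primal ascent on the policy parameters, projected dual ascent on the multiplier
\theta \leftarrow \theta + \alpha \, \nabla_\theta \mathcal{L}(\theta, \lambda),
\qquad
\lambda \leftarrow \max\!\left(0,\; \lambda + \beta \left( J_C(\pi_\theta) - d \right)\right)
```

In this form the multiplier grows while the expected violation cost exceeds the budget and shrinks toward zero once the constraint is satisfied, which is what lets a policy gradient method trade correction accuracy against constraint violations.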

Subjects :: *REINFORCEMENT learning
*COST functions
*DECISION making
*MARKOV processes
*PREDICTION models
