Motion Planning of Robot Manipulators for a Smoother Path Using a Twin Delayed Deep Deterministic Policy Gradient with Hindsight Experience Replay

Authors :: Jung-Su Kim
Myeong Seop Kim
Dong Ki Han
Jae-Han Park
Source :: Applied Sciences, Vol 10, Iss 2, p 575 (2020)
Publication Year :: 2020
Publisher :: MDPI AG, 2020.
Abstract: To enhance the performance of robot systems in the manufacturing industry, it is essential to develop motion and task planning algorithms. In particular, the motion plan should be generated automatically so that the robot can cope with various working environments. Although a PRM (Probabilistic Roadmap) provides feasible paths when the starting and goal positions of a robot manipulator are given, the paths might not be smooth enough, which can lead to inefficient performance of the robot system. This paper proposes a motion planning algorithm for robot manipulators using the twin delayed deep deterministic policy gradient (TD3), a reinforcement learning algorithm tailored to MDPs (Markov Decision Processes) with continuous actions. In addition, hindsight experience replay (HER) is employed with TD3 to enhance sample efficiency: path planning for a robot manipulator is an MDP with sparse reward, and HER can deal with such a problem. The proposed algorithm is applied to 2-DOF and 3-DOF manipulators, and the resulting paths are shown to be smoother and shorter than those designed by PRM.

Subjects :: 0209 industrial biotechnology
reinforcement learning
Computer science
hindsight experience replay (her)
02 engineering and technology
Probabilistic roadmap
lcsh:Technology
Motion (physics)
Task (project management)
lcsh:Chemistry
020901 industrial engineering & automation
0202 electrical engineering, electronic engineering, information engineering
Reinforcement learning
General Materials Science
probabilistic roadmap (prm)
Motion planning
Instrumentation
lcsh:QH301-705.5
Fluid Flow and Transfer Processes
lcsh:T
Process Chemistry and Technology
General Engineering
Control engineering
motion planning
lcsh:QC1-999
Computer Science Applications
lcsh:Biology (General)
lcsh:QD1-999
lcsh:TA1-2040
Path (graph theory)
020201 artificial intelligence & image processing
Markov decision process
lcsh:Engineering (General). Civil engineering (General)
Hindsight bias
lcsh:Physics
policy gradient
