TY - GEN
T1 - A Path Planning Method Based on Deep Reinforcement Learning with Improved Prioritized Experience Replay for Human-Robot Collaboration
AU - Sun, Deyu
AU - Wen, Jingqian
AU - Wang, Jingfei
AU - Yang, Xiaonan
AU - Hu, Yaoguang
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - Owing to its ability to integrate human flexibility with robotic automation, human-robot collaboration has tremendous potential in intelligent manufacturing. A defining characteristic of this collaboration is that robotic arms must cooperate with humans in a dynamically changing environment, in which humans can be treated as dynamic obstacles. One of the significant challenges in human-robot collaboration is therefore the development of obstacle avoidance strategies for robotic path planning in dynamically changing environments. Because traditional two-dimensional path planning methods cannot handle high-dimensional spaces, many researchers have turned their attention to deep reinforcement learning, and many deep reinforcement learning methods have been applied to robotic arm path planning. However, most deep reinforcement learning models for robotic arm path planning require a significant amount of training time to converge. In this study, we introduce SAC-iPER, an algorithm that combines Soft Actor-Critic (SAC) with an improved version of Prioritized Experience Replay (PER). We prioritize experiences based on task rewards, using metrics such as time consumption and collision occurrences, in addition to task completion, to rank experiences. This reward-based ordering significantly improves both the speed and the quality of learning. The results of this study show a significant improvement in the training efficiency of deep reinforcement learning models for robotic arm path planning in human-robot collaboration, paving the way for the development of more efficient human-robot collaborative systems.
AB - Owing to its ability to integrate human flexibility with robotic automation, human-robot collaboration has tremendous potential in intelligent manufacturing. A defining characteristic of this collaboration is that robotic arms must cooperate with humans in a dynamically changing environment, in which humans can be treated as dynamic obstacles. One of the significant challenges in human-robot collaboration is therefore the development of obstacle avoidance strategies for robotic path planning in dynamically changing environments. Because traditional two-dimensional path planning methods cannot handle high-dimensional spaces, many researchers have turned their attention to deep reinforcement learning, and many deep reinforcement learning methods have been applied to robotic arm path planning. However, most deep reinforcement learning models for robotic arm path planning require a significant amount of training time to converge. In this study, we introduce SAC-iPER, an algorithm that combines Soft Actor-Critic (SAC) with an improved version of Prioritized Experience Replay (PER). We prioritize experiences based on task rewards, using metrics such as time consumption and collision occurrences, in addition to task completion, to rank experiences. This reward-based ordering significantly improves both the speed and the quality of learning. The results of this study show a significant improvement in the training efficiency of deep reinforcement learning models for robotic arm path planning in human-robot collaboration, paving the way for the development of more efficient human-robot collaborative systems.
KW - deep reinforcement learning
KW - human-robot collaboration
KW - PER
KW - SAC
UR - http://www.scopus.com/inward/record.url?scp=85195862214&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-60412-6_15
DO - 10.1007/978-3-031-60412-6_15
M3 - Conference contribution
AN - SCOPUS:85195862214
SN - 9783031604119
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 196
EP - 206
BT - Human-Computer Interaction - Thematic Area, HCI 2024, Held as Part of the 26th HCI International Conference, HCII 2024, Proceedings
A2 - Kurosu, Masaaki
A2 - Hashizume, Ayako
PB - Springer Science and Business Media Deutschland GmbH
T2 - Human Computer Interaction thematic area of the 26th International Conference on Human-Computer Interaction, HCII 2024
Y2 - 29 June 2024 through 4 July 2024
ER -