TY - GEN
T1 - Pose Control of a Multi-Segment Soft Continuum Robot Based on HR-DP Algorithm
AU - Ding, Yi
AU - Dong, Jiaxiang
AU - Li, Wei
AU - Wang, Chunbao
AU - Hu, Xiping
AU - Liu, Quanquan
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Soft continuum robots have broad application prospects due to their superior flexibility, compliance and safety. However, their high mechanical complexity poses a major challenge for precise pose control modeling. The Deep Deterministic Policy Gradient (DDPG) algorithm, a foundational model-free, off-policy method, is well-suited for such tasks. However, DDPG is known for its inherent instabilities, particularly its susceptibility to Q-value overestimation and its difficulties in sparse-reward environments, which can hinder performance and efficiency. To address these limitations, this paper proposes a Hybrid Reward-Shaped DDPG and PSO fusion algorithm (HR-DP). This method integrates a multi-component reward function with the PSO algorithm, aiming to guide the agent's exploration more effectively and achieve higher exploration accuracy. A target point tracking experiment was performed in simulation with a two-segment soft continuum robot. The results demonstrate that the robot controlled by HR-DP achieves significantly faster convergence, higher cumulative rewards, higher accuracy and greater stability than under the standard DDPG baseline and other state-of-the-art methods. This work provides a practical and effective HR-DP algorithm, enhanced from DDPG, which can benefit pose control of multi-segment soft continuum robots in complex, real-world environments.
AB - Soft continuum robots have broad application prospects due to their superior flexibility, compliance and safety. However, their high mechanical complexity poses a major challenge for precise pose control modeling. The Deep Deterministic Policy Gradient (DDPG) algorithm, a foundational model-free, off-policy method, is well-suited for such tasks. However, DDPG is known for its inherent instabilities, particularly its susceptibility to Q-value overestimation and its difficulties in sparse-reward environments, which can hinder performance and efficiency. To address these limitations, this paper proposes a Hybrid Reward-Shaped DDPG and PSO fusion algorithm (HR-DP). This method integrates a multi-component reward function with the PSO algorithm, aiming to guide the agent's exploration more effectively and achieve higher exploration accuracy. A target point tracking experiment was performed in simulation with a two-segment soft continuum robot. The results demonstrate that the robot controlled by HR-DP achieves significantly faster convergence, higher cumulative rewards, higher accuracy and greater stability than under the standard DDPG baseline and other state-of-the-art methods. This work provides a practical and effective HR-DP algorithm, enhanced from DDPG, which can benefit pose control of multi-segment soft continuum robots in complex, real-world environments.
KW - Actor-Critic
KW - DDPG
KW - Deep Reinforcement Learning
KW - Hybrid Reward
KW - PSO
KW - continuum robots
UR - https://www.scopus.com/pages/publications/105034667665
U2 - 10.1109/CloudCom67567.2025.11331512
DO - 10.1109/CloudCom67567.2025.11331512
M3 - Conference contribution
AN - SCOPUS:105034667665
T3 - Proceedings - 2025 IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2025
BT - Proceedings - 2025 IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE 16th International Conference on Cloud Computing Technology and Science, IEEE CloudCom 2025
Y2 - 14 November 2025 through 16 November 2025
ER -