
Research on LSTM-PPO Obstacle Avoidance Algorithm and Training Environment for Unmanned Surface Vehicles

  • Wangbin Luo
  • Xiang Wang
  • Fang Han
  • Zhiguo Zhou*
  • Junyu Cai
  • Lin Zeng
  • Hong Chen
  • Jiawei Chen
  • Xuehua Zhou
  • *Corresponding author for this work
  • Ltd.
  • Beijing Institute of Technology
  • Guangzhou Customs District Technology Center

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Current intelligent obstacle avoidance algorithms for unmanned surface vehicles (USVs) based on deep reinforcement learning are usually trained with a mass-point model in an ideal environment. In actual navigation, however, the influence of the ship model and the water surface environment means that the reward function used in training does not match the actual situation, which results in poor obstacle avoidance. To address these problems, this paper proposes a long short-term memory-proximal policy optimization (LSTM-PPO) intelligent obstacle avoidance algorithm for non-mass-point models in non-ideal environments, and designs a corresponding deep reinforcement learning training environment. We integrate the motion characteristics of the USV and the influencing factors of the water surface environment into a curiosity-driven reward function to improve autonomous obstacle avoidance, and combine it with an LSTM network that identifies and stores obstacle information to improve adaptability to unknown environments. For virtual simulation, a USV physical model and a refined water-surface deep reinforcement learning training environment containing a variety of obstacle models are built in the Unity engine. The experimental results demonstrate that the LSTM-PPO algorithm achieves effective and rational obstacle avoidance, with a success rate of 86.7%, an average path length of 198.52 m, and a convergence time of 1.5 h. Compared with three other deep reinforcement learning algorithms, the LSTM-PPO algorithm shows a 21.5% reduction in average convergence time, an 18.5% reduction in average path length, and an approximately 20% improvement in the success rate of obstacle avoidance in complex environments. These results indicate that the LSTM-PPO algorithm can effectively improve search efficiency and produce more rational path planning in obstacle avoidance for USVs.
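
For readers unfamiliar with the PPO half of LSTM-PPO, the following is a minimal sketch of the clipped surrogate objective from standard PPO. It is not taken from the paper: the function name `ppo_clipped_objective` is hypothetical, and the clip range of 0.2 is an assumed default, since the abstract does not report hyperparameters. The LSTM policy network and the curiosity-driven reward are not modeled here.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value; the abstract does not report it
) -> float:
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    ratios[i] is pi_new(a_i | s_i) / pi_old(a_i | s_i) for sample i,
    advantages[i] is the advantage estimate for the same sample.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

The clipping removes the incentive to move the new policy far from the old one in a single update, which is one reason PPO is a common choice for training in simulated environments.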

Original language: English
Article number: 479
Journal: Journal of Marine Science and Engineering
Volume: 13
Issue number: 3
DOI
Publication status: Published - Mar 2025
