TY - GEN
T1 - Research on Navigation Algorithm of Unmanned Ground Vehicle Based on Imitation Learning and Curiosity Driven
AU - Liu, Shiqi
AU - Chen, Jiawei
AU - Zu, Bowen
AU - Zhou, Xuehua
AU - Zhou, Zhiguo
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2022
Y1 - 2022
N2 - The application of deep reinforcement learning (DRL) to autonomous navigation of unmanned ground vehicles (UGVs) suffers from sparse rewards, which makes the trained model difficult to converge and hard to transfer to real vehicles. To address this, this paper proposes Double I-PPO, an autonomous navigation algorithm with effective exploratory learning, which designs pre-training behaviors based on imitation learning (IL) to guide the UGV toward positive states and introduces an intrinsic curiosity module (ICM) that generates intrinsic reward signals to encourage exploratory learning strategies. A training scene is built in Unity to evaluate the performance of the algorithm, and the learned policy is integrated into the motion planning stack of a ROS vehicle to extend testing to real-world scenes. Experiments show that in environments with random obstacles the method does not rely on prior map information; compared with similar DRL algorithms, it converges faster and achieves a navigation success rate above 85%.
AB - The application of deep reinforcement learning (DRL) to autonomous navigation of unmanned ground vehicles (UGVs) suffers from sparse rewards, which makes the trained model difficult to converge and hard to transfer to real vehicles. To address this, this paper proposes Double I-PPO, an autonomous navigation algorithm with effective exploratory learning, which designs pre-training behaviors based on imitation learning (IL) to guide the UGV toward positive states and introduces an intrinsic curiosity module (ICM) that generates intrinsic reward signals to encourage exploratory learning strategies. A training scene is built in Unity to evaluate the performance of the algorithm, and the learned policy is integrated into the motion planning stack of a ROS vehicle to extend testing to real-world scenes. Experiments show that in environments with random obstacles the method does not rely on prior map information; compared with similar DRL algorithms, it converges faster and achieves a navigation success rate above 85%.
KW - Deep reinforcement learning
KW - Navigation
KW - ROS
KW - Sparse reward
KW - Unity
KW - Unmanned ground vehicle
UR - http://www.scopus.com/inward/record.url?scp=85146654232&partnerID=8YFLogxK
U2 - 10.1007/978-981-19-9198-1_46
DO - 10.1007/978-981-19-9198-1_46
M3 - Conference contribution
AN - SCOPUS:85146654232
SN - 9789811991974
T3 - Communications in Computer and Information Science
SP - 609
EP - 621
BT - Methods and Applications for Modeling and Simulation of Complex Systems - 21st Asia Simulation Conference, AsiaSim 2022, Proceedings
A2 - Fan, Wenhui
A2 - Zhang, Lin
A2 - Li, Ni
A2 - Song, Xiao
PB - Springer Science and Business Media Deutschland GmbH
T2 - 21st Asia Simulation Conference, AsiaSim 2022
Y2 - 9 December 2022 through 11 December 2022
ER -