TY - GEN
T1 - Continuous control for moving object tracking of unmanned skid-steered vehicle based on reinforcement learning
AU - Li, Zheng
AU - Zhou, Junjie
AU - Li, Xueyuan
AU - Du, Xu
AU - Wang, Lei
AU - Wang, Yun
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/11/27
Y1 - 2020/11/27
N2 - Skid-steered vehicles are widely used due to their robust mechanical structure and high maneuverability. Moving object tracking for an unmanned skid-steered vehicle (USSV) is a challenging task that requires delicate actions to ensure a smooth trajectory and an accurate response between the ego vehicle and the moving object. However, the inevitable slipping and sliding of the tires makes the vehicle difficult to control, and an accurate model of the USSV is hard to formulate. This paper proposes a real-time moving object tracking system with continuous actions for USSVs based on a reinforcement learning algorithm named Twin Delayed Deep Deterministic Policy Gradient (TD3). The capacity of the replay buffer, which is critical in the training process, is softly updated as the number of training episodes increases. We added two control-group models with a fixed replay buffer capacity and trained the RL agent from scratch in the Gazebo environment. From the training and validation results, we conclude that our RL model performs well for moving target tracking, and that the model with the softly updated replay buffer achieves high efficiency in training and high accuracy in evaluation.
AB - Skid-steered vehicles are widely used due to their robust mechanical structure and high maneuverability. Moving object tracking for an unmanned skid-steered vehicle (USSV) is a challenging task that requires delicate actions to ensure a smooth trajectory and an accurate response between the ego vehicle and the moving object. However, the inevitable slipping and sliding of the tires makes the vehicle difficult to control, and an accurate model of the USSV is hard to formulate. This paper proposes a real-time moving object tracking system with continuous actions for USSVs based on a reinforcement learning algorithm named Twin Delayed Deep Deterministic Policy Gradient (TD3). The capacity of the replay buffer, which is critical in the training process, is softly updated as the number of training episodes increases. We added two control-group models with a fixed replay buffer capacity and trained the RL agent from scratch in the Gazebo environment. From the training and validation results, we conclude that our RL model performs well for moving target tracking, and that the model with the softly updated replay buffer achieves high efficiency in training and high accuracy in evaluation.
KW - Continuous control
KW - Reinforcement learning
KW - USSV object tracking
KW - Unmanned skid-steered vehicle
UR - http://www.scopus.com/inward/record.url?scp=85098982009&partnerID=8YFLogxK
U2 - 10.1109/ICUS50048.2020.9274962
DO - 10.1109/ICUS50048.2020.9274962
M3 - Conference contribution
AN - SCOPUS:85098982009
T3 - Proceedings of 2020 3rd International Conference on Unmanned Systems, ICUS 2020
SP - 456
EP - 461
BT - Proceedings of 2020 3rd International Conference on Unmanned Systems, ICUS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference on Unmanned Systems, ICUS 2020
Y2 - 27 November 2020 through 28 November 2020
ER -