TY - GEN
T1 - Reinforcement-learning-based path planning for UAVs in intensive obstacle environment
AU - Guo, Miao
AU - Long, Teng
AU - Li, Hui
AU - Sun, Jingliang
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - In an intensive obstacle environment, the available flying space is narrow, making it difficult to generate a feasible path for UAVs within a limited runtime. In this paper, a Q-learning-based planning algorithm is presented to improve the efficiency of single-UAV path planning in intensive obstacle environments. By constructing an offline learning planning architecture over the state-action space, the proposed method realizes rapid UAV path planning and avoids the high time consumption of online reinforcement-learning path planning. To address the cost of Q-table re-training, a probabilistic local update mechanism is proposed that updates the Q-values of selected states, reducing the time required for Q-table re-training and enabling rapid Q-table updates. The probability of a Q-value update depends on the distance to the new obstacle: the closer a state is to the new obstacle, the higher its probability of re-training. Therefore, the flight trajectory can be quickly re-planned when the environment changes. Simulation results show that the proposed Q-learning-based planning algorithm can generate obstacle-avoiding paths for a UAV from random start positions. Compared with the classical A* algorithm, the path planning time based on the trained Q-table is reduced from seconds to milliseconds, which significantly improves the efficiency of path planning.
AB - In an intensive obstacle environment, the available flying space is narrow, making it difficult to generate a feasible path for UAVs within a limited runtime. In this paper, a Q-learning-based planning algorithm is presented to improve the efficiency of single-UAV path planning in intensive obstacle environments. By constructing an offline learning planning architecture over the state-action space, the proposed method realizes rapid UAV path planning and avoids the high time consumption of online reinforcement-learning path planning. To address the cost of Q-table re-training, a probabilistic local update mechanism is proposed that updates the Q-values of selected states, reducing the time required for Q-table re-training and enabling rapid Q-table updates. The probability of a Q-value update depends on the distance to the new obstacle: the closer a state is to the new obstacle, the higher its probability of re-training. Therefore, the flight trajectory can be quickly re-planned when the environment changes. Simulation results show that the proposed Q-learning-based planning algorithm can generate obstacle-avoiding paths for a UAV from random start positions. Compared with the classical A* algorithm, the path planning time based on the trained Q-table is reduced from seconds to milliseconds, which significantly improves the efficiency of path planning.
KW - Q-learning
KW - UAV
KW - offline training
KW - path planning
KW - probabilistic local update mechanism
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85128041443&partnerID=8YFLogxK
U2 - 10.1109/CAC53003.2021.9727746
DO - 10.1109/CAC53003.2021.9727746
M3 - Conference contribution
AN - SCOPUS:85128041443
T3 - Proceeding - 2021 China Automation Congress, CAC 2021
SP - 6451
EP - 6455
BT - Proceeding - 2021 China Automation Congress, CAC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 China Automation Congress, CAC 2021
Y2 - 22 October 2021 through 24 October 2021
ER -