TY - GEN
T1 - Heuristic-enhanced Proximal Policy Optimization Algorithm for Navigation
AU - Zhang, Yuhang
AU - Liu, Yanmin
AU - Liu, Haikuo
AU - Huang, Yidian
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - The challenge of navigating unmanned aerial vehicles (UAVs) can be effectively tackled through the application of reinforcement learning (RL) methodologies. Nonetheless, the baseline Proximal Policy Optimization (PPO) algorithm faces significant hurdles in achieving efficient convergence, primarily due to the sparse nature of rewards associated with navigation tasks. Addressing this issue, this paper presents an enhanced approach by integrating heuristic exploration strategies into the PPO framework, leading to the development of the AS-PPO (Action Switching PPO) algorithm. Furthermore, the research introduces specifically tailored reward functions designed for navigation purposes. Empirical evidence from experimental outcomes confirms the viability and efficacy of the proposed AS-PPO method, highlighting its superior performance in handling continuous action spaces within navigation tasks.
AB - The challenge of navigating unmanned aerial vehicles (UAVs) can be effectively tackled through the application of reinforcement learning (RL) methodologies. Nonetheless, the baseline Proximal Policy Optimization (PPO) algorithm faces significant hurdles in achieving efficient convergence, primarily due to the sparse nature of rewards associated with navigation tasks. Addressing this issue, this paper presents an enhanced approach by integrating heuristic exploration strategies into the PPO framework, leading to the development of the AS-PPO (Action Switching PPO) algorithm. Furthermore, the research introduces specifically tailored reward functions designed for navigation purposes. Empirical evidence from experimental outcomes confirms the viability and efficacy of the proposed AS-PPO method, highlighting its superior performance in handling continuous action spaces within navigation tasks.
KW - heuristic method
KW - reinforcement learning
KW - UAV navigation
UR - https://www.scopus.com/pages/publications/105012111676
U2 - 10.1109/ICAISISAS64483.2025.11051844
DO - 10.1109/ICAISISAS64483.2025.11051844
M3 - Conference contribution
AN - SCOPUS:105012111676
T3 - 2025 Joint International Conference on Automation-Intelligence-Safety, ICAIS 2025 and International Symposium on Autonomous Systems, ISAS 2025
BT - 2025 Joint International Conference on Automation-Intelligence-Safety, ICAIS 2025 and International Symposium on Autonomous Systems, ISAS 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 Joint International Conference on Automation-Intelligence-Safety, ICAIS 2025 and International Symposium on Autonomous Systems, ISAS 2025
Y2 - 23 May 2025 through 25 May 2025
ER -