Heuristic-enhanced Proximal Policy Optimization Algorithm for Navigation

  • Yuhang Zhang
  • Yanmin Liu
  • Haikuo Liu*
  • Yidian Huang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The challenge of navigating unmanned aerial vehicles (UAVs) can be effectively tackled through the application of reinforcement learning (RL) methodologies. Nonetheless, the baseline Proximal Policy Optimization (PPO) algorithm faces significant hurdles in achieving efficient convergence, primarily due to the sparse nature of rewards associated with navigation tasks. To address this issue, this paper integrates heuristic exploration strategies into the PPO framework, leading to the development of the AS-PPO (Action Switching PPO) algorithm. Furthermore, the research introduces reward functions specifically tailored to navigation. Experimental results confirm the viability and efficacy of the proposed AS-PPO method, highlighting its superior performance in handling continuous action spaces within navigation tasks.
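
The abstract does not describe how action switching or the tailored rewards are defined. As a purely illustrative sketch, and not the authors' implementation, one common way to add a heuristic to a sparse-reward navigation task is to follow a goal-directed action with some probability and the policy's action otherwise, while a dense distance-based term supplements the sparse goal reward. Every name and constant below (`heuristic_action`, `switch_action`, `shaped_reward`, `switch_prob`, `goal_bonus`, `step_cost`) is hypothetical, and a 2-D continuous action is assumed.

```python
import math
import random


def heuristic_action(position, goal, max_step=1.0):
    """Hypothetical heuristic: step straight toward the goal, capped at max_step."""
    dx, dy = goal[0] - position[0], goal[1] - position[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    scale = min(max_step, dist) / dist
    return (dx * scale, dy * scale)


def switch_action(policy_action, position, goal, switch_prob, rng=random):
    """Return the heuristic action with probability switch_prob, else the policy's.

    Assumption: switching is a simple coin flip; the rule actually used by
    AS-PPO is not given in the abstract.
    """
    if rng.random() < switch_prob:
        return heuristic_action(position, goal)
    return policy_action


def shaped_reward(prev_pos, new_pos, goal, goal_radius=0.5,
                  goal_bonus=10.0, step_cost=0.01):
    """Hypothetical dense reward: progress toward the goal minus a per-step
    cost, plus a bonus once the agent is within goal_radius of the goal."""
    progress = math.dist(prev_pos, goal) - math.dist(new_pos, goal)
    reward = progress - step_cost
    if math.dist(new_pos, goal) <= goal_radius:
        reward += goal_bonus
    return reward
```

Because PPO is an on-policy method, actions chosen by a heuristic rather than sampled from the current policy would need some form of correction in the update; how the paper treats this is not stated in the abstract.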

Original language: English
Title of host publication: 2025 Joint International Conference on Automation-Intelligence-Safety, ICAIS 2025 and International Symposium on Autonomous Systems, ISAS 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331544706
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 2025 Joint International Conference on Automation-Intelligence-Safety, ICAIS 2025 and International Symposium on Autonomous Systems, ISAS 2025 - Xi'an, China
Duration: 23 May 2025 to 25 May 2025

Publication series

Name: 2025 Joint International Conference on Automation-Intelligence-Safety, ICAIS 2025 and International Symposium on Autonomous Systems, ISAS 2025

Conference

Conference: 2025 Joint International Conference on Automation-Intelligence-Safety, ICAIS 2025 and International Symposium on Autonomous Systems, ISAS 2025
Country/Territory: China
City: Xi'an
Period: 23/05/25 to 25/05/25

Keywords

  • heuristic method
  • reinforcement learning
  • UAV navigation
