A Two-Layered Reinforcement Learning Framework for AoI-Aware Trajectory Planning and Scheduling Optimization in Multi-UAV Networks

  • Kang Fu
  • Qingjie Zhao*
  • Lei Wang

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Unmanned aerial vehicles (UAVs) have emerged as an effective solution for data collection in Internet of Things (IoT) networks. To maintain data freshness, the age of information (AoI) has become a key performance metric, which is jointly influenced by UAV trajectory planning and sensor node (SN) scheduling. However, optimizing these two interdependent tasks simultaneously leads to high-dimensional decision spaces and unstable learning dynamics. To solve this problem, we propose a two-layered reinforcement learning framework for AoI-aware trajectory planning and scheduling optimization, named TL-RATS. In the upper layer, a reinforcement learning module learns long-term UAV trajectories using the agent-by-agent policy optimization (A2PO) algorithm, enhanced by sequential updates and preceding-agent off-policy correction (PreOPC) to ensure sample-efficient and stable learning. In the lower layer, we formulate the scheduling problem as a time-constrained 0-1 knapsack optimization, where each item's weight represents data collection and transmission time, and its value corresponds to the potential AoI reduction. A lightweight dynamic programming (DP) algorithm allocates transmission opportunities under the time constraints. Extensive experiments under diverse SN distributions demonstrate that TL-RATS significantly reduces AoI and outperforms representative baselines, including MAPPO, IPPO, MAT, greedy scheduling, and a fully joint policy. These results highlight the benefits of the proposed layered design and task-specific coordination.
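
The lower-layer formulation described in the abstract can be illustrated with a textbook 0-1 knapsack dynamic program. The sketch below is not the authors' implementation: the function name `schedule_sensors`, the use of integer time slots, and the example numbers are all assumptions made for illustration. Each sensor node has a weight (collection plus transmission time) and a value (potential AoI reduction), and the DP selects the subset that maximizes total value within a time budget.

```python
from typing import List, Tuple


def schedule_sensors(
    times: List[int], gains: List[float], budget: int
) -> Tuple[float, List[int]]:
    """Generic 0-1 knapsack DP (illustrative, not the paper's code).

    times[i]  : hypothetical time cost of serving sensor i, in integer slots
    gains[i]  : hypothetical AoI reduction obtained by serving sensor i
    budget    : total number of time slots available
    Returns the best total gain and the indices of the selected sensors.
    """
    n = len(times)
    # best[i][t] = maximum gain using only the first i sensors within t slots
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = times[i - 1], gains[i - 1]
        for t in range(budget + 1):
            best[i][t] = best[i - 1][t]  # option 1: skip sensor i-1
            if w <= t and best[i - 1][t - w] + v > best[i][t]:
                best[i][t] = best[i - 1][t - w] + v  # option 2: serve it

    # Walk the table backwards to recover which sensors were selected.
    chosen: List[int] = []
    t = budget
    for i in range(n, 0, -1):
        if best[i][t] != best[i - 1][t]:
            chosen.append(i - 1)
            t -= times[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    # Made-up example: three sensors, five available time slots.
    total, picked = schedule_sensors([3, 4, 2], [4.0, 5.0, 3.0], 5)
    print(total, picked)
```

The table has (n + 1) × (budget + 1) entries, so the cost grows linearly in both the number of sensor nodes and the number of time slots, which is consistent with the abstract's description of the scheduler as lightweight.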

Original language: English
Journal: IEEE Internet of Things Journal
Publication status: Accepted/In press - 2025
Externally published: Yes

Keywords

  • Age of information
  • deep reinforcement learning
  • planning and scheduling optimization
  • unmanned aerial vehicles
