Abstract
Uncrewed aerial vehicles (UAVs) have emerged as an effective solution for data collection in Internet of Things (IoT) networks. To maintain data freshness, the age of information (AoI) has become a key performance metric, which is jointly influenced by UAV trajectory planning and sensor node (SN) scheduling. However, optimizing these two interdependent tasks simultaneously leads to high-dimensional decision spaces and unstable learning dynamics. To solve this problem, we propose a two-layered reinforcement learning framework for AoI-aware trajectory planning and scheduling optimization, named TL-RATS. In the upper layer, a reinforcement learning module is designed to learn long-term UAV trajectories by using the agent-by-agent policy optimization (A2PO) algorithm, enhanced by sequential updates and preceding-agent off-policy correction (PreOPC) to ensure sample-efficient and stable learning. In the lower layer, we formulate the scheduling problem as a time-constrained 0–1 knapsack optimization, where each item’s weight represents data collection and transmission time, and its value corresponds to potential AoI reduction. A lightweight dynamic programming (DP) algorithm is used to allocate transmission opportunities under time constraints. Extensive experiments under diverse SN distributions demonstrate that TL-RATS significantly reduces AoI and outperforms representative baselines, including MAPPO, IPPO, MAT, greedy scheduling, and a fully joint policy. These results highlight the benefits of the proposed layered design and task-specific coordination.
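The lower-layer scheduling idea described in the abstract can be illustrated with a textbook 0–1 knapsack dynamic program. The sketch below is not the authors' implementation: the function name `schedule_sensors`, the discretization of time into integer slots, and the input lists `times`, `gains`, and `budget` are all assumptions made for illustration. Each SN is an item whose weight is its collection-plus-transmission time and whose value is its potential AoI reduction.

```python
from typing import List, Tuple


def schedule_sensors(
    times: List[int], gains: List[float], budget: int
) -> Tuple[float, List[int]]:
    """Textbook 0-1 knapsack DP (illustrative, not the paper's code).

    times[i]  -- assumed integer number of time slots needed to serve SN i
    gains[i]  -- assumed potential AoI reduction from serving SN i
    budget    -- assumed number of time slots available to the UAV
    Returns the best total gain and the indices of the selected SNs.
    """
    n = len(times)
    # best[i][t] = max total gain using only the first i SNs within t slots
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = times[i - 1], gains[i - 1]
        for t in range(budget + 1):
            best[i][t] = best[i - 1][t]  # option 1: skip SN i-1
            if w <= t and best[i - 1][t - w] + v > best[i][t]:
                best[i][t] = best[i - 1][t - w] + v  # option 2: serve it

    # Backtrack through the table to recover which SNs were selected.
    chosen: List[int] = []
    t = budget
    for i in range(n, 0, -1):
        if best[i][t] != best[i - 1][t]:
            chosen.append(i - 1)
            t -= times[i - 1]
    chosen.reverse()
    return best[n][budget], chosen
```

For example, with hypothetical inputs `times=[2, 3, 4]`, `gains=[3.0, 4.0, 5.0]`, and `budget=5`, the call `schedule_sensors(...)` returns `(7.0, [0, 1])`: serving the first two SNs fills the five available slots and yields a larger total AoI reduction than serving the third SN alone. The table has `(n + 1) * (budget + 1)` entries, which is what keeps this kind of scheduler lightweight when the time budget is small.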
| Original language | English |
|---|---|
| Pages (from-to) | 4668-4682 |
| Number of pages | 15 |
| Journal | IEEE Internet of Things Journal |
| Volume | 13 |
| Issue number | 3 |
| Publication status | Published - Feb 2026 |
| Externally published | Yes |
Keywords
- Age of information (AoI)
- deep reinforcement learning (DRL)
- planning and scheduling optimization
- uncrewed aerial vehicles (UAVs)
Fingerprint
Dive into the research topics of 'A Two-Layered Reinforcement Learning Framework for AoI-Aware Trajectory Planning and Scheduling Optimization in Multi-UAV Networks'. Together they form a unique fingerprint.