A Two-Layered Reinforcement Learning Framework for AoI-Aware Trajectory Planning and Scheduling Optimization in Multi-UAV Networks

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Uncrewed aerial vehicles (UAVs) have emerged as an effective solution for data collection in Internet of Things (IoT) networks. To maintain data freshness, the age of information (AoI) has become a key performance metric, which is jointly influenced by UAV trajectory planning and sensor node (SN) scheduling. However, optimizing these two interdependent tasks simultaneously leads to high-dimensional decision spaces and unstable learning dynamics. To solve this problem, we propose a two-layered reinforcement learning framework for AoI-aware trajectory planning and scheduling optimization, named TL-RATS. In the upper layer, a reinforcement learning module is designed to learn long-term UAV trajectories by using the agent-by-agent policy optimization (A2PO) algorithm, enhanced by sequential updates and preceding-agent off-policy correction (PreOPC) to ensure sample-efficient and stable learning. In the lower layer, we formulate the scheduling problem as a time-constrained 0–1 knapsack optimization, where each item’s weight represents data collection and transmission time, and its value corresponds to potential AoI reduction. A lightweight dynamic programming (DP) algorithm is used to allocate transmission opportunities under time constraints. Extensive experiments under diverse SN distributions demonstrate that TL-RATS significantly reduces AoI and outperforms representative baselines, including MAPPO, IPPO, MAT, greedy scheduling, and a fully joint policy. These results highlight the benefits of the proposed layered design and task-specific coordination.
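
The lower-layer scheduling step described in the abstract can be illustrated with a textbook 0–1 knapsack dynamic program. The sketch below is not the authors' implementation: the function name `schedule_sensor_nodes`, the use of integer time slots, and the example numbers are all assumptions made for illustration. Each sensor node's weight is its collection-plus-transmission time and its value is the AoI reduction it would yield if served.

```python
from typing import List, Tuple


def schedule_sensor_nodes(
    times: List[int], aoi_gains: List[float], budget: int
) -> Tuple[float, List[int]]:
    """Pick the subset of SNs that maximizes total AoI reduction within a time budget.

    Hypothetical helper (not from the paper). Assumes each service time is a
    positive integer number of time slots and `budget` is a non-negative integer.
    """
    n = len(times)
    # best[i][t] = largest total AoI reduction using only the first i SNs
    # with at most t time slots available.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = times[i - 1], aoi_gains[i - 1]
        for t in range(budget + 1):
            best[i][t] = best[i - 1][t]  # option 1: skip SN i-1
            if w <= t and best[i - 1][t - w] + v > best[i][t]:
                best[i][t] = best[i - 1][t - w] + v  # option 2: serve SN i-1

    # Backtrack through the table to recover which SNs were served.
    chosen: List[int] = []
    t = budget
    for i in range(n, 0, -1):
        if best[i][t] != best[i - 1][t]:
            chosen.append(i - 1)
            t -= times[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Illustrative numbers only: three SNs and a budget of five time slots.
total_gain, served = schedule_sensor_nodes([3, 4, 2], [4.0, 5.0, 3.0], 5)
```

The table has (n + 1) × (budget + 1) entries, so the cost grows linearly in both the number of sensor nodes and the time budget, which is consistent with the abstract's description of the DP step as lightweight.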

Original language: English
Pages (from-to): 4668-4682
Number of pages: 15
Journal: IEEE Internet of Things Journal
Volume: 13
Issue number: 3
DOI
Publication status: Published - Feb 2026
Externally published: Yes