TY - GEN
T1 - Multi-Agent Power and Resource Allocation for D2D Communications
T2 - 96th IEEE Vehicular Technology Conference, VTC 2022-Fall
AU - Xiang, Honglin
AU - Peng, Jingyi
AU - Gao, Zhen
AU - Li, Lingjie
AU - Yang, Yang
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - The explosion in the number of smartphones and wearable devices brings the challenge of meeting high achievable rate (AR) requirements, and device-to-device (D2D) communications have become a critical technology for addressing this challenge. However, the co-channel interference caused by spectrum reuse and the low-delay requirement restrict performance improvements in D2D communications. In this paper, we consider the cases with and without a time delay constraint, and design a joint power control and resource allocation scheme based on deep reinforcement learning (DRL) to maximize the AR of cellular users (CUEs) and D2D users (DUEs). Specifically, D2D pairs are modeled as multiple agents that reuse the CUE spectrum; each agent can independently select spectrum resources and transmit power without any prior information in order to mitigate interference. Furthermore, a distributed double deep Q-network with priority sampling (Pr-DDQN) algorithm is proposed, which helps the agents learn more dominant features during experience replay. Simulation results indicate that the Pr-DDQN algorithm obtains a higher AR than existing DRL algorithms. In particular, the probability that agents select low transmit power increases with the remaining transmission time, which demonstrates that the agents can successfully learn and perceive the implicit relationship imposed by the time delay constraint.
AB - The explosion in the number of smartphones and wearable devices brings the challenge of meeting high achievable rate (AR) requirements, and device-to-device (D2D) communications have become a critical technology for addressing this challenge. However, the co-channel interference caused by spectrum reuse and the low-delay requirement restrict performance improvements in D2D communications. In this paper, we consider the cases with and without a time delay constraint, and design a joint power control and resource allocation scheme based on deep reinforcement learning (DRL) to maximize the AR of cellular users (CUEs) and D2D users (DUEs). Specifically, D2D pairs are modeled as multiple agents that reuse the CUE spectrum; each agent can independently select spectrum resources and transmit power without any prior information in order to mitigate interference. Furthermore, a distributed double deep Q-network with priority sampling (Pr-DDQN) algorithm is proposed, which helps the agents learn more dominant features during experience replay. Simulation results indicate that the Pr-DDQN algorithm obtains a higher AR than existing DRL algorithms. In particular, the probability that agents select low transmit power increases with the remaining transmission time, which demonstrates that the agents can successfully learn and perceive the implicit relationship imposed by the time delay constraint.
KW - Device-to-device communications
KW - deep reinforcement learning
KW - power control
KW - resource allocation
UR - http://www.scopus.com/inward/record.url?scp=85146987937&partnerID=8YFLogxK
U2 - 10.1109/VTC2022-Fall57202.2022.10012889
DO - 10.1109/VTC2022-Fall57202.2022.10012889
M3 - Conference contribution
AN - SCOPUS:85146987937
T3 - IEEE Vehicular Technology Conference
BT - 2022 IEEE 96th Vehicular Technology Conference, VTC 2022-Fall - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 September 2022 through 29 September 2022
ER -