Abstract
Device-to-device (D2D) communication is regarded as a promising technology for supporting spectrally efficient Internet of Things (IoT) in beyond-fifth-generation (5G) and sixth-generation (6G) networks. This paper investigates the spectrum access problem in D2D-assisted cellular networks based on deep reinforcement learning (DRL), applicable to both uplink and downlink scenarios. Specifically, we consider a time-slotted cellular network in which D2D nodes share spectrum resources with cellular users (CUEs) in a time-splitting manner. In addition, D2D nodes may reuse time slots already occupied by CUEs according to a location-based spectrum access (LSA) strategy, provided that cellular communication quality is preserved. The key challenge is that D2D nodes have no information about the LSA strategy or the access principle of the CUEs. We therefore design a DRL-based spectrum access scheme with which D2D nodes can autonomously acquire an optimal strategy for efficient spectrum access, without any prior knowledge, toward a specific objective such as maximizing the normalized sum throughput. Moreover, we adopt a generalized double deep Q-network (DDQN) algorithm and extend the objective function to explore resource allocation fairness among D2D nodes. The proposed scheme is evaluated under various conditions, and simulation results show that it achieves near-optimal throughput under different objectives compared with the benchmark, a theoretical throughput upper bound derived from a genie-aided scheme with complete system knowledge.
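The abstract mentions a double deep Q-network (DDQN) as the learning algorithm. The paper's generalized variant and its state/reward design are not given here, but the core DDQN idea — the online network selects the next action while the target network evaluates it, reducing Q-value overestimation — can be sketched as follows. All names and the toy Q-values are illustrative assumptions, not the paper's implementation.

```python
def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double DQN bootstrap target (a minimal sketch, not the paper's code).

    The online network's Q-values over next-state actions pick the action
    (argmax), while the target network's Q-values evaluate that action.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection with the ONLINE network's estimates.
    a_star = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation with the TARGET network's estimates.
    return reward + gamma * q_target_next[a_star]

# Toy example: hypothetical Q-values over three candidate time slots.
q_online = [0.2, 0.8, 0.5]   # online net prefers slot 1
q_target = [0.3, 0.6, 0.9]   # target net evaluates slot 1 as 0.6
print(ddqn_target(1.0, q_online, q_target, gamma=0.9))
```

In a vanilla DQN the target would instead be `reward + gamma * max(q_target_next)`, which tends to overestimate values; decoupling selection from evaluation is what distinguishes DDQN.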
Original language | English |
---|---|
Journal | IEEE Internet of Things Journal |
Publication status | Accepted/In press - 2022 |
Externally published | Yes |
Keywords
- Deep reinforcement learning
- Device-to-device (D2D) communication
- Dynamic spectrum access
- Internet of Things (IoT)
- Linear programming
- Power control
- Resource management
- Throughput
- Uplink