Abstract
Device-to-device (D2D) communication is regarded as a promising technology for supporting spectrally efficient Internet of Things (IoT) in beyond-fifth-generation (5G) and sixth-generation (6G) networks. This paper investigates the spectrum access problem in D2D-assisted cellular networks based on deep reinforcement learning (DRL), applicable to both uplink and downlink scenarios. Specifically, we consider a time-slotted cellular network in which D2D nodes share spectrum resources with cellular users (CUEs) in a time-splitting manner. In addition, D2D nodes may reuse time slots already occupied by CUEs according to a location-based spectrum access (LSA) strategy, provided that cellular communication quality is preserved. The key challenge is that D2D nodes have no information about the LSA strategy or the access principle of the CUEs. We therefore design a DRL-based spectrum access scheme with which D2D nodes can autonomously acquire an optimal strategy for efficient spectrum access, without any prior knowledge, toward a specific objective such as maximizing the normalized sum throughput. Moreover, we adopt a generalized double deep Q-network (DDQN) algorithm and extend the objective function to explore resource allocation fairness among D2D nodes. The proposed scheme is evaluated under various conditions, and simulation results show that it achieves near-optimal throughput under different objectives compared with the benchmark, a theoretical throughput upper bound derived from a genie-aided scheme with complete system knowledge.
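The abstract mentions a double deep Q-network (DDQN) as the learning algorithm. The paper's generalized variant and its state/reward design are not given here, but the core DDQN idea — the online network selects the next action while the target network evaluates it, reducing Q-value overestimation — can be sketched as follows. All names and the toy Q-values are illustrative assumptions, not the paper's implementation.

```python
def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double DQN bootstrap target (a minimal sketch, not the paper's code).

    The online network's Q-values over next-state actions pick the action
    (argmax), while the target network's Q-values evaluate that action.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection with the ONLINE network's estimates.
    a_star = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation with the TARGET network's estimates.
    return reward + gamma * q_target_next[a_star]

# Toy example: hypothetical Q-values over three candidate time slots.
q_online = [0.2, 0.8, 0.5]   # online net prefers slot 1
q_target = [0.3, 0.6, 0.9]   # target net evaluates slot 1 as 0.6
print(ddqn_target(1.0, q_online, q_target, gamma=0.9))
```

In a vanilla DQN the target would instead be `reward + gamma * max(q_target_next)`, which tends to overestimate values; decoupling selection from evaluation is what distinguishes DDQN.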
Original language | English |
---|---|
Journal | IEEE Internet of Things Journal |
Publication status | Accepted/In press - 2022 |
Externally published | Yes |
Keywords
- Deep reinforcement learning
- Device-to-device (D2D) communication
- Dynamic spectrum access
- Internet of Things (IoT)
- Linear programming
- Power control
- Resource management
- Throughput
- Uplink