Multi-Agent Reinforcement Learning Based UAV Swarm Communications Against Jamming

Zefang Lv; Liang Xiao; Yousong Du; Guohang Niu; Chengwen Xing; Wenyuan Xu

doi:10.1109/TWC.2023.3268082

Zefang Lv, Liang Xiao*, Yousong Du, Guohang Niu, Chengwen Xing, Wenyuan Xu

*Corresponding author for this work

School of Information and Electronics

Research output: Contribution to journal › Article › peer-review

19 Citations (Scopus)

Abstract

The swarm relay and power allocation policy determines the bit error rate and the energy consumption of unmanned aerial vehicles (UAVs) and can be optimized based on the network and jamming model, which is rarely known by UAVs. In this paper, we propose a multi-agent reinforcement learning (RL)-based UAV swarm communication scheme to optimize the relay selection and power allocation against jamming. Based on the network topology, channel states, previous performance and observations shared by the neighboring UAVs, this scheme formulates the policy distribution to improve the policy exploration and applies a policy learning mechanism to stabilize the learning process. Based on transfer learning, the shared swarm experiences are exploited to accelerate the initial learning and improve policy optimization. A deep RL-based scheme is proposed to mitigate the state quantization error for the rapidly changing channel states under high swarm moving speed and thus further improve the anti-jamming performance. This scheme designs a policy network with four fully connected layers to approximate the policy distribution and uses another two neural networks to estimate the average policy distribution and the expected long-term utility, respectively, to update the policy network for stabilized deep learning. We investigate the computational complexity and derive the performance bound regarding the bit error rate, the energy consumption and the utility. Simulation and experimental results verify the performance gain of our proposed schemes over related works.
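The abstract describes a policy network with four fully connected layers that approximates the policy distribution over relay and power choices. A minimal sketch of that idea follows, using only the Python standard library. Everything beyond "four fully connected layers producing a policy distribution" is an assumption made for illustration and is not taken from the paper: the layer widths, the ReLU activations, the softmax output, the state dimension, the `(relay, power level)` action set, and the names `PolicyNetwork`, `action_distribution` and `sample_action` are all hypothetical. Training, the two auxiliary networks and the transfer-learning step are not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply one fully connected layer: W x + b."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax, so the outputs form a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class PolicyNetwork:
    """Four fully connected layers mapping a state vector to action probabilities.

    Assumption: three hidden-width layers with ReLU followed by a softmax output layer.
    """

    def __init__(self, state_dim, hidden_dim, num_actions, seed=0):
        rng = random.Random(seed)
        sizes = [state_dim, hidden_dim, hidden_dim, hidden_dim, num_actions]
        self.layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

    def action_distribution(self, state):
        x = list(state)
        for layer in self.layers[:-1]:
            x = relu(dense(x, layer))
        return softmax(dense(x, self.layers[-1]))

    def sample_action(self, state, rng):
        """Draw an action index according to the current policy distribution."""
        probs = self.action_distribution(state)
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical action set: each action is a (relay index, transmit power level) pair.
NUM_RELAYS, NUM_POWER_LEVELS = 3, 4
ACTIONS = [(r, p) for r in range(NUM_RELAYS) for p in range(NUM_POWER_LEVELS)]

policy = PolicyNetwork(state_dim=6, hidden_dim=16, num_actions=len(ACTIONS))
state = [0.2, 0.5, 0.1, 0.9, 0.3, 0.7]  # placeholder observation values, not real data
probs = policy.action_distribution(state)
relay, power_level = ACTIONS[policy.sample_action(state, random.Random(1))]
```

Sampling from the distribution, rather than always taking the most probable action, is one simple way to keep exploring alternative relay and power choices; the exploration mechanism actually used in the paper may differ.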

Original language	English
Pages (from-to)	9063-9075
Number of pages	13
Journal	IEEE Transactions on Wireless Communications
Volume	22
Issue number	12
DOIs	https://doi.org/10.1109/TWC.2023.3268082
Publication status	Published - 1 Dec 2023

Keywords

Unmanned aerial vehicles
jamming
reinforcement learning
swarm communications

Access to Document

10.1109/TWC.2023.3268082

Cite this

@article{71e0def1ca974eaf9a19b5bc81472e0e,

title = "Multi-Agent Reinforcement Learning Based UAV Swarm Communications Against Jamming",

abstract = "The swarm relay and power allocation policy determines the bit error rate and the energy consumption of unmanned aerial vehicles (UAVs) and can be optimized based on the network and jamming model, which is rarely known by UAVs. In this paper, we propose a multi-agent reinforcement learning (RL)-based UAV swarm communication scheme to optimize the relay selection and power allocation against jamming. Based on the network topology, channel states, previous performance and observations shared by the neighboring UAVs, this scheme formulates the policy distribution to improve the policy exploration and applies a policy learning mechanism to stabilize the learning process. Based on transfer learning, the shared swarm experiences are exploited to accelerate the initial learning and improve policy optimization. A deep RL-based scheme is proposed to mitigate the state quantization error for the rapidly changing channel states under high swarm moving speed and thus further improve the anti-jamming performance. This scheme designs a policy network with four fully connected layers to approximate the policy distribution and uses another two neural networks to estimate the average policy distribution and the expected long-term utility, respectively, to update the policy network for stabilized deep learning. We investigate the computational complexity and derive the performance bound regarding the bit error rate, the energy consumption and the utility. Simulation and experimental results verify the performance gain of our proposed schemes over related works.",

keywords = "Unmanned aerial vehicles, jamming, reinforcement learning, swarm communications",

author = "Zefang Lv and Liang Xiao and Yousong Du and Guohang Niu and Chengwen Xing and Wenyuan Xu",

note = "Publisher Copyright: {\textcopyright} 2002-2012 IEEE.",

year = "2023",

month = dec,

day = "1",

doi = "10.1109/TWC.2023.3268082",

language = "English",

volume = "22",

pages = "9063--9075",

journal = "IEEE Transactions on Wireless Communications",

issn = "1536-1276",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "12",

}

TY - JOUR

T1 - Multi-Agent Reinforcement Learning Based UAV Swarm Communications Against Jamming

AU - Lv, Zefang

AU - Xiao, Liang

AU - Du, Yousong

AU - Niu, Guohang

AU - Xing, Chengwen

AU - Xu, Wenyuan

PY - 2023/12/1

Y1 - 2023/12/1

N2 - The swarm relay and power allocation policy determines the bit error rate and the energy consumption of unmanned aerial vehicles (UAVs) and can be optimized based on the network and jamming model, which is rarely known by UAVs. In this paper, we propose a multi-agent reinforcement learning (RL)-based UAV swarm communication scheme to optimize the relay selection and power allocation against jamming. Based on the network topology, channel states, previous performance and observations shared by the neighboring UAVs, this scheme formulates the policy distribution to improve the policy exploration and applies a policy learning mechanism to stabilize the learning process. Based on transfer learning, the shared swarm experiences are exploited to accelerate the initial learning and improve policy optimization. A deep RL-based scheme is proposed to mitigate the state quantization error for the rapidly changing channel states under high swarm moving speed and thus further improve the anti-jamming performance. This scheme designs a policy network with four fully connected layers to approximate the policy distribution and uses another two neural networks to estimate the average policy distribution and the expected long-term utility, respectively, to update the policy network for stabilized deep learning. We investigate the computational complexity and derive the performance bound regarding the bit error rate, the energy consumption and the utility. Simulation and experimental results verify the performance gain of our proposed schemes over related works.

KW - Unmanned aerial vehicles

KW - jamming

KW - reinforcement learning

KW - swarm communications

UR - http://www.scopus.com/inward/record.url?scp=85159722023&partnerID=8YFLogxK

U2 - 10.1109/TWC.2023.3268082

DO - 10.1109/TWC.2023.3268082

M3 - Article

AN - SCOPUS:85159722023

SN - 1536-1276

VL - 22

SP - 9063

EP - 9075

JO - IEEE Transactions on Wireless Communications

JF - IEEE Transactions on Wireless Communications

IS - 12

ER -
