TY - JOUR
T1 - Multi-Agent Reinforcement Learning Based UAV Swarm Communications Against Jamming
AU - Lv, Zefang
AU - Xiao, Liang
AU - Du, Yousong
AU - Niu, Guohang
AU - Xing, Chengwen
AU - Xu, Wenyuan
N1 - Publisher Copyright: © 2002-2012 IEEE.
PY - 2023/12/1
Y1 - 2023/12/1
AB - The swarm relay and power allocation policy determines the bit error rate and the energy consumption of unmanned aerial vehicles (UAVs) and can be optimized based on the network and jamming model, which is rarely known by UAVs. In this paper, we propose a multi-agent reinforcement learning (RL)-based UAV swarm communication scheme to optimize the relay selection and power allocation against jamming. Based on the network topology, channel states, previous performance and observations shared by the neighboring UAVs, this scheme formulates the policy distribution to improve the policy exploration and applies a policy learning mechanism to stabilize the learning process. Based on transfer learning, the shared swarm experiences are exploited to accelerate the initial learning and improve policy optimization. A deep RL-based scheme is proposed to mitigate the state quantization error for the rapidly changing channel states under high swarm moving speed and thus further improve the anti-jamming performance. This scheme designs a policy network with four fully connected layers to approximate the policy distribution and uses another two neural networks to estimate the average policy distribution and the expected long-term utility, respectively, to update the policy network for stabilized deep learning. We investigate the computational complexity and derive the performance bound regarding the bit error rate, the energy consumption and the utility. Simulation and experimental results verify the performance gain of our proposed schemes over related works.
KW - Unmanned aerial vehicles
KW - jamming
KW - reinforcement learning
KW - swarm communications
UR - http://www.scopus.com/inward/record.url?scp=85159722023&partnerID=8YFLogxK
U2 - 10.1109/TWC.2023.3268082
DO - 10.1109/TWC.2023.3268082
M3 - Article
AN - SCOPUS:85159722023
SN - 1536-1276
VL - 22
SP - 9063
EP - 9075
JO - IEEE Transactions on Wireless Communications
JF - IEEE Transactions on Wireless Communications
IS - 12
ER -