Dynamic Spectrum Anti-Jamming With Reinforcement Learning Based on Value Function Approximation

Xinyu Zhu, Yang Huang*, Shaoyu Wang, Qihui Wu, Xiaohu Ge, Yuan Liu, Zhen Gao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

This letter addresses the spectrum anti-jamming problem with multiple Internet of Things (IoT) devices for uplink transmissions, where policies for configuring frequency-domain channels have to be learned without the knowledge of the time-frequency distribution of the interference. The problem of decision-making or learning is expected to be solved by reinforcement learning (RL) approaches. However, the state-of-the-art RL-based spectrum anti-jamming methods may not be applicable in IoT systems, suffer from high computational complexity or may converge to a policy that may not be the best for each user. Therefore, we propose a novel spectrum anti-jamming scheme where configuration policies for the IoT devices are sequentially optimized with value function approximation-based multi-agent RL. Simulation results show that our proposed algorithm outperforms various baselines in terms of average normalized throughput.
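The idea of sequentially optimizing per-device channel policies with value function approximation can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a hypothetical sweeping jammer that moves to the next channel in every slot, a hand-picked one-hot feature map, and semi-gradient Q-learning with a linear value function. All names, constants, and the reward definition below are illustrative assumptions. Devices are trained one at a time, and each later device treats the frozen greedy policies of the earlier ones as part of its environment.

```python
import random

NUM_CHANNELS = 5   # assumed number of frequency-domain channels
NUM_DEVICES = 2    # assumed number of IoT devices
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1  # illustrative hyperparameters


def features(last_jammed, busy, action):
    """One-hot feature over (offset from last jammed channel, collision flag)."""
    phi = [0.0] * (2 * NUM_CHANNELS)
    offset = (action - last_jammed) % NUM_CHANNELS
    phi[2 * offset + int(action in busy)] = 1.0
    return phi


def q_value(w, last_jammed, busy, action):
    """Linear approximation Q(s, a) = w . phi(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, features(last_jammed, busy, action)))


def greedy(w, last_jammed, busy):
    """Channel with the highest approximate value (first one wins ties)."""
    return max(range(NUM_CHANNELS), key=lambda a: q_value(w, last_jammed, busy, a))


def busy_channels(policies, last_jammed):
    """Channels chosen, in order, by the already-optimized devices."""
    busy = []
    for w in policies:
        busy.append(greedy(w, last_jammed, busy))
    return busy


def train_device(policies, rng, steps=5000):
    """Semi-gradient Q-learning for one device; earlier devices stay frozen."""
    w = [0.0] * (2 * NUM_CHANNELS)
    last_jammed = 0
    for _ in range(steps):
        busy = busy_channels(policies, last_jammed)
        if rng.random() < EPSILON:
            action = rng.randrange(NUM_CHANNELS)        # explore
        else:
            action = greedy(w, last_jammed, busy)       # exploit
        jammed = (last_jammed + 1) % NUM_CHANNELS       # assumed sweeping jammer
        # Reward 1 only if the slot is neither jammed nor shared with an earlier device.
        reward = 1.0 if action != jammed and action not in busy else 0.0
        next_busy = busy_channels(policies, jammed)
        best_next = max(q_value(w, jammed, next_busy, a) for a in range(NUM_CHANNELS))
        td_error = reward + GAMMA * best_next - q_value(w, last_jammed, busy, action)
        phi = features(last_jammed, busy, action)
        w = [wi + ALPHA * td_error * xi for wi, xi in zip(w, phi)]
        last_jammed = jammed
    return w


def train_all(seed=0):
    """Optimize the devices one after another, as a toy sequential scheme."""
    rng = random.Random(seed)
    policies = []
    for _ in range(NUM_DEVICES):
        policies.append(train_device(policies, rng))
    return policies
```

Calling `train_all()` returns one weight vector per device, and `busy_channels(policies, last_jammed)` then lists the channel each device would pick greedily for a given previously jammed channel. The feature map uses 2 x `NUM_CHANNELS` weights per device rather than a full state-action table, which is the sense in which the value function is approximated here.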

Original language: English
Pages (from-to): 386-390
Number of pages: 5
Journal: IEEE Wireless Communications Letters
Volume: 12
Issue number: 2
DOIs
Publication status: Published - 1 Feb 2023

Keywords

  • Internet of Things
  • Markov decision process
  • Uplink transmissions
  • anti-jamming
  • reinforcement learning
