Abstract
With the proliferation of software-defined radio technology, malicious jamming attacks against wireless communications have become more aggressive and flexible: by varying both the jamming parameters and the jamming policies, an attacker can easily create a complex and highly dynamic jamming environment. Such an environment makes it challenging for most deep reinforcement learning (DRL) based anti-jamming schemes to rapidly identify effective strategies. In this paper, we develop a dual-tier policy-oriented anti-jamming (DPA) scheme based on DRL to facilitate swift adaptation to complex jamming environments. Unlike existing works, the scheme introduces an upper-tier jamming pattern recognition (JPR) network to extract underlying jamming policy-related information, which serves as guidance for the lower-tier deep recurrent Q-network in anti-jamming decision-making. The output of the JPR network enables the sharing of experiences among jamming patterns that originate from the same jamming policy and facilitates more efficient and targeted learning of anti-jamming strategies. Extensive experimental results demonstrate the superiority of our DPA scheme over other DRL-based benchmark schemes in terms of both anti-jamming performance and convergence speed.
| Original language | English |
|---|---|
| Pages (from-to) | 10652-10668 |
| Number of pages | 17 |
| Journal | IEEE Transactions on Wireless Communications |
| Volume | 25 |
| Publication status | Published - 2026 |
| Externally published | Yes |
Keywords
- anti-jamming communication
- cognitive radio network
- deep recurrent Q-network
- deep reinforcement learning
- dynamic spectrum access
Fingerprint
Dive into the research topics of 'A Dual-Tier Policy-Oriented Anti-Jamming Scheme Based on Deep Reinforcement Learning'. Together they form a unique fingerprint.