TY - GEN
T1 - Heuristic Dual Q-Learning Based Radar Anti-Jamming Decision-Making
AU - Xiang, Hongwu
AU - Bai, Zhiquan
AU - Chen, Yang
AU - Li, Na
AU - Hao, Xinhong
AU - Dai, Jian
AU - Yan, Xiaopeng
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - As the modern electronic countermeasure environment grows more dynamic and complex, traditional anti-jamming decision-making methods struggle to meet the real-time and intelligent demands of cognitive radar, and several reinforcement learning algorithms have been designed and applied to radar anti-jamming decision-making. However, typical Q-learning suffers from Q-value overestimation and unstable training results. To address these problems, this paper proposes a collaborative optimization based on dual Q-tables and a heuristic function for anti-jamming decision-making in cognitive radar systems. It separates action selection from value evaluation by using two independent Q-tables, thus suppressing the valuation bias caused by the common single Q-table. Meanwhile, the heuristic function is dynamically designed to optimize the exploration process based on the optimal action and reward. Simulation results show that the proposed algorithm improves decision-making accuracy by 16% on average compared with the popular Q-learning and 'State-Action-Reward-State-Action' (Sarsa) methods, and that the stability of strategy selection is significantly enhanced, providing a reliable solution for real-time decision-making of cognitive radar systems in complex electromagnetic environments.
AB - As the modern electronic countermeasure environment grows more dynamic and complex, traditional anti-jamming decision-making methods struggle to meet the real-time and intelligent demands of cognitive radar, and several reinforcement learning algorithms have been designed and applied to radar anti-jamming decision-making. However, typical Q-learning suffers from Q-value overestimation and unstable training results. To address these problems, this paper proposes a collaborative optimization based on dual Q-tables and a heuristic function for anti-jamming decision-making in cognitive radar systems. It separates action selection from value evaluation by using two independent Q-tables, thus suppressing the valuation bias caused by the common single Q-table. Meanwhile, the heuristic function is dynamically designed to optimize the exploration process based on the optimal action and reward. Simulation results show that the proposed algorithm improves decision-making accuracy by 16% on average compared with the popular Q-learning and 'State-Action-Reward-State-Action' (Sarsa) methods, and that the stability of strategy selection is significantly enhanced, providing a reliable solution for real-time decision-making of cognitive radar systems in complex electromagnetic environments.
KW - Markov decision process
KW - Reinforcement learning
KW - Anti-jamming decision-making
KW - Cognitive radar
UR - https://www.scopus.com/pages/publications/105034129328
U2 - 10.1109/ICCT67417.2025.11374245
DO - 10.1109/ICCT67417.2025.11374245
M3 - Conference contribution
AN - SCOPUS:105034129328
T3 - International Conference on Communication Technology Proceedings, ICCT
SP - 1726
EP - 1730
BT - 2025 IEEE 25th International Conference on Communication Technology, ICCT 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 25th IEEE International Conference on Communication Technology, ICCT 2025
Y2 - 16 October 2025 through 18 October 2025
ER -