TY - JOUR
T1 - Latent-Maximum-Entropy-Based Cognitive Radar Reward Function Estimation With Nonideal Observations
AU - Zhang, Luyao
AU - Zhu, Mengtao
AU - Qin, Jiahao
AU - Li, Yunjie
N1 - Publisher Copyright:
© 1965-2011 IEEE.
PY - 2024
Y1 - 2024
N2 - The concept of "inverse cognition"has recently emerged and has garnered significant research attention in the radar community from aspects of inverse filtering, inverse cognitive radar (I-CR), and designing smart interference for counter-adversarial autonomous systems (i.e., the cognitive radar). For instance, identifying whether an adversary cognitive radar's actions (such as waveform selection and beam scheduling) are consistent with the constrained utility maximization and if so, estimating the utility function has led to recent formulations of I-CR. In this context of I-CR, we address the challenges of estimating unknown and complex utility functions with nonideal action observations. We mean nonideal by missing and nonoptimal action observations. In this article, we assume that the adversary CR is optimizing its action policy by maximizing some forms of the expected utility function with unknown and complex structures over long time horizons. We then designed an IRL method under nonideal observations and illustrated the applicability of the methods. The nonideal factors are treated as latent variables, and the I-CR problem is formulated as a latent information inference problem. Then, an expectation-maximization (EM)-based algorithm is developed to iteratively solve the problem with nonconvex and nonlinear optimizations through a Lagrangian relaxation reformulation. The performance of the proposed method is evaluated and compared utilizing simulated CR target tracking scenarios with Markov decision process (MDP) and partially observable MDP settings. Experimental results verified the robustness, effectiveness, and superiority of the proposed method.
AB - The concept of "inverse cognition"has recently emerged and has garnered significant research attention in the radar community from aspects of inverse filtering, inverse cognitive radar (I-CR), and designing smart interference for counter-adversarial autonomous systems (i.e., the cognitive radar). For instance, identifying whether an adversary cognitive radar's actions (such as waveform selection and beam scheduling) are consistent with the constrained utility maximization and if so, estimating the utility function has led to recent formulations of I-CR. In this context of I-CR, we address the challenges of estimating unknown and complex utility functions with nonideal action observations. We mean nonideal by missing and nonoptimal action observations. In this article, we assume that the adversary CR is optimizing its action policy by maximizing some forms of the expected utility function with unknown and complex structures over long time horizons. We then designed an IRL method under nonideal observations and illustrated the applicability of the methods. The nonideal factors are treated as latent variables, and the I-CR problem is formulated as a latent information inference problem. Then, an expectation-maximization (EM)-based algorithm is developed to iteratively solve the problem with nonconvex and nonlinear optimizations through a Lagrangian relaxation reformulation. The performance of the proposed method is evaluated and compared utilizing simulated CR target tracking scenarios with Markov decision process (MDP) and partially observable MDP settings. Experimental results verified the robustness, effectiveness, and superiority of the proposed method.
KW - Cognitive radar (CR)
KW - expectation-maximization (EM)
KW - inverse cognition
KW - inverse reinforcement learning
KW - latent maximum entropy (LME)
UR - http://www.scopus.com/inward/record.url?scp=85194891302&partnerID=8YFLogxK
U2 - 10.1109/TAES.2024.3406671
DO - 10.1109/TAES.2024.3406671
M3 - Article
AN - SCOPUS:85194891302
SN - 0018-9251
VL - 60
SP - 6656
EP - 6670
JO - IEEE Transactions on Aerospace and Electronic Systems
JF - IEEE Transactions on Aerospace and Electronic Systems
IS - 5
ER -