Latent-Maximum-Entropy-Based Cognitive Radar Reward Function Estimation With Nonideal Observations

Luyao Zhang, Mengtao Zhu*, Jiahao Qin, Yunjie Li

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The concept of "inverse cognition" has recently emerged and has garnered significant research attention in the radar community from the perspectives of inverse filtering, inverse cognitive radar (I-CR), and the design of smart interference for counter-adversarial autonomous systems (i.e., the cognitive radar (CR)). For instance, identifying whether an adversary cognitive radar's actions (such as waveform selection and beam scheduling) are consistent with constrained utility maximization and, if so, estimating the utility function has led to recent formulations of I-CR. In this context of I-CR, we address the challenges of estimating unknown and complex utility functions from nonideal action observations. By nonideal, we mean missing and nonoptimal action observations. In this article, we assume that the adversary CR optimizes its action policy by maximizing some form of expected utility function with unknown and complex structure over long time horizons. We then design an inverse reinforcement learning (IRL) method that operates under nonideal observations and illustrate its applicability. The nonideal factors are treated as latent variables, and the I-CR problem is formulated as a latent information inference problem. An expectation-maximization (EM)-based algorithm is then developed to iteratively solve the resulting nonconvex and nonlinear optimization problem through a Lagrangian relaxation reformulation. The performance of the proposed method is evaluated and compared using simulated CR target tracking scenarios with Markov decision process (MDP) and partially observable MDP settings. Experimental results verify the robustness, effectiveness, and superiority of the proposed method.
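
For orientation, the generic latent maximum entropy (LME) principle that underlies EM-based IRL with hidden data can be sketched as follows. This is a textbook-style illustration and not necessarily the article's exact formulation: the symbols $Y$ (observed part of a trajectory), $Z$ (latent part, e.g., missing or nonoptimal actions), $\phi_k$ (feature functions), $\theta_k$ (reward weights), and the linear reward form are all assumptions introduced here, whereas the abstract refers to more complex utility structures.

```latex
% Latent maximum entropy over complete trajectories X = (Y, Z):
\begin{aligned}
\max_{P}\quad & -\sum_{X} P(X)\,\log P(X) \\
\text{s.t.}\quad & \sum_{X} P(X)\,\phi_k(X)
   = \sum_{Y} \tilde{P}(Y) \sum_{Z} P(Z \mid Y)\,\phi_k(Y, Z)
   \quad \forall k, \\
 & \sum_{X} P(X) = 1 .
\end{aligned}

% Restricting P to the log-linear family obtained from the Lagrangian,
%   P_\theta(X) = \exp\!\Big(\sum_k \theta_k \phi_k(X)\Big) / \Xi(\theta),
% EM alternates the two steps below until \theta converges.

% E-step: expected features with latent parts filled in under the current \theta^{(t)}
\hat{\phi}_k^{(t)} = \sum_{Y} \tilde{P}(Y) \sum_{Z}
    P_{\theta^{(t)}}(Z \mid Y)\,\phi_k(Y, Z)

% M-step: ordinary maximum entropy fit to the fixed targets \hat{\phi}^{(t)}
\theta^{(t+1)} = \arg\max_{\theta}
    \Big[ \sum_k \theta_k\,\hat{\phi}_k^{(t)} - \log \Xi(\theta) \Big]
```

Here $\tilde{P}(Y)$ denotes the empirical distribution of the observed data and $\Xi(\theta)$ the partition function. Because the right-hand side of the feature constraint itself depends on $P$, the problem is nonconvex, which is why an iterative EM scheme is used in place of a single convex solve.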

Original language: English
Pages (from-to): 6656-6670
Number of pages: 15
Journal: IEEE Transactions on Aerospace and Electronic Systems
Volume: 60
Issue number: 5
Publication status: Published - 2024

Keywords

  • Cognitive radar (CR)
  • expectation-maximization (EM)
  • inverse cognition
  • inverse reinforcement learning
  • latent maximum entropy (LME)
