基于深度确定性梯度学习的集群多目标分配方法

Translated title of the contribution: Research on Multi-Target Assignment Method for Clusters Based on Deep Deterministic Policy Gradient Learning

Qiaoyi Li, Zhengjie Wang*, Xiaoning Zhang, Qiyuan Cheng

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

In the target assignment for multi-missile cooperative operations, there exists uncertainty in the number and variety of enemy platforms and anti-ship missiles, which makes it difficult to model the target assignment algorithm. To improve the effectiveness of attacks under high-dynamic collaborative attack conditions, a dynamic battlefield environment model and a single-round Markov decision model for multi-target assignment were established. An improved deep deterministic policy gradient (DDPG) assignment algorithm was proposed to automatically find the optimal allocation strategy through interaction with the simulator. The algorithm uses the mask method to mask the action space and adapt to the number and type of platforms. The simulation results show that under different defense configurations and configurations of red and blue sides, the performance improvement of the attack strategy obtained by the algorithm was about 87.5% compared with that of the random strategy, and the reasoning time of the model was about 0.04 ms. This research will accelerate the application of DDPG-based methods in intelligent decision-making in high-dynamic environments, and promote the research on cluster autonomous decision-making methods.

Translated title of the contributionResearch on Multi-Target Assignment Method for Clusters Based on Deep Deterministic Policy Gradient Learning
Original languageChinese (Traditional)
Pages (from-to)1051-1057
Number of pages7
JournalBeijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
Volume44
Issue number10
DOIs
Publication statusPublished - Oct 2024

Fingerprint

Dive into the research topics of 'Research on Multi-Target Assignment Method for Clusters Based on Deep Deterministic Policy Gradient Learning'. Together they form a unique fingerprint.

Cite this