Enhanced missile hit probability actor-critic algorithm for autonomous decision-making in air-to-air confrontation

Can Chen; Li Mo; Maolong Lv; Defu Lin; Tao Song; Jinde Cao

doi:10.1016/j.ast.2024.109285

Can Chen, Li Mo^*, Maolong Lv, Defu Lin, Tao Song, Jinde Cao

^*Corresponding author for this work

School of Aerospace Engineering

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

In recent years, autonomous decision-making has emerged as a critical technology in air-to-air confrontation scenarios, garnering significant attention. This paper presents a novel AI algorithm, the Missile Hit Probability Enhanced Actor-Critic (MHPAC), designed for autonomous decision-making in such confrontations, whose primary objective is to maximize the probability of defeating opponents while minimizing the risk of being shot down. By incorporating a pre-trained Missile Hit Probability (MHP) model into reward shaping and exploration within the framework of Reinforcement Learning (RL), the MHPAC algorithm enhances the learning capabilities of the Actor-Critic (AC) algorithm specifically tailored for air-to-air confrontation scenarios. Furthermore, the MHP model is also integrated into the confrontation strategy to inform missile launch decisions. Using the MHPAC algorithm, the confrontation strategy is achieved via the training strategy of curriculum learning and self-play learning. Results demonstrate that the MHPAC algorithm effectively explores efficient maneuvering strategies for missile launch and defense, overcoming challenges associated with sparse and delayed reward signals. The decision-making capabilities of the integrated maneuvering and missile launch strategy are significantly enhanced by the proposed MHPAC algorithm, with a relative win ratio of over 65% against different strategies. Moreover, the trained strategy only needs 0.039 s for real-time decision-making. This research holds considerable promise for achieving air superiority and mission success in complex and dynamic aerial environments.
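The abstract states that a pre-trained MHP model is incorporated into reward shaping, but it does not give the shaping formula. As an illustration only, the sketch below uses standard potential-based reward shaping, in which a probability estimate serves as the potential function. This is an assumed, generic technique and not necessarily the one used in MHPAC; `shaped_reward`, `toy_potential`, `gamma`, and `weight` are hypothetical names introduced here.

```python
from typing import Callable, Sequence

State = Sequence[float]


def shaped_reward(
    env_reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.99,
    weight: float = 1.0,
    done: bool = False,
) -> float:
    """Return r + weight * (gamma * phi(s') - phi(s)).

    This is generic potential-based shaping, assumed here for illustration;
    the source abstract does not specify the form of the shaping term.
    """
    # By convention the potential of a terminal state is taken as zero.
    phi_next = 0.0 if done else potential(next_state)
    return env_reward + weight * (gamma * phi_next - potential(state))


def toy_potential(state: State) -> float:
    """Hypothetical stand-in for a pre-trained probability model.

    It clips the first state feature to [0, 1]; a real model would be a
    learned estimator, which is not described in the source.
    """
    return min(1.0, max(0.0, state[0]))


if __name__ == "__main__":
    # A transition that raises the estimated probability earns a positive
    # shaping bonus even though the environment reward is still zero.
    print(shaped_reward(0.0, [0.2], [0.5], toy_potential, gamma=1.0))
```

A dense shaping term of this kind is one common way to address the sparse and delayed rewards the abstract mentions, since it gives the learner feedback on intermediate transitions rather than only at the end of an episode.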

Original language	English
Article number	109285
Journal	Aerospace Science and Technology
Volume	151
DOIs	https://doi.org/10.1016/j.ast.2024.109285
Publication status	Published - Aug 2024

Keywords

Actor-critic
Air-to-air confrontation
Autonomous decision-making
Missile hit probability
Reinforcement learning

Cite this

Chen, C., Mo, L., Lv, M., Lin, D., Song, T., & Cao, J. (2024). Enhanced missile hit probability actor-critic algorithm for autonomous decision-making in air-to-air confrontation. Aerospace Science and Technology, 151, Article 109285. https://doi.org/10.1016/j.ast.2024.109285

@article{f6b07f48111e42fda228e1ec025a2b7f,

title = "Enhanced missile hit probability actor-critic algorithm for autonomous decision-making in air-to-air confrontation",

abstract = "In recent years, autonomous decision-making has emerged as a critical technology in air-to-air confrontation scenarios, garnering significant attention. This paper presents a novel AI algorithm, the Missile Hit Probability Enhanced Actor-Critic (MHPAC), designed for autonomous decision-making in such confrontations, whose primary objective is to maximize the probability of defeating opponents while minimizing the risk of being shot down. By incorporating a pre-trained Missile Hit Probability (MHP) model into reward shaping and exploration within the framework of Reinforcement Learning (RL), the MHPAC algorithm enhances the learning capabilities of the Actor-Critic (AC) algorithm specifically tailored for air-to-air confrontation scenarios. Furthermore, the MHP model is also integrated into the confrontation strategy to inform missile launch decisions. Using the MHPAC algorithm, the confrontation strategy is achieved via the training strategy of curriculum learning and self-play learning. Results demonstrate that the MHPAC algorithm effectively explores efficient maneuvering strategies for missile launch and defense, overcoming challenges associated with sparse and delayed reward signals. The decision-making capabilities of the integrated maneuvering and missile launch strategy are significantly enhanced by the proposed MHPAC algorithm, with a relative win ratio of over 65% against different strategies. Moreover, the trained strategy only needs 0.039 s for real-time decision-making. This research holds considerable promise for achieving air superiority and mission success in complex and dynamic aerial environments.",

keywords = "Actor-critic, Air-to-air confrontation, Autonomous decision-making, Missile hit probability, Reinforcement learning",

author = "Can Chen and Li Mo and Maolong Lv and Defu Lin and Tao Song and Jinde Cao",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier Masson SAS",

year = "2024",

month = aug,

doi = "10.1016/j.ast.2024.109285",

language = "English",

volume = "151",

journal = "Aerospace Science and Technology",

issn = "1270-9638",

publisher = "Elsevier Masson s.r.l.",

}

TY - JOUR

T1 - Enhanced missile hit probability actor-critic algorithm for autonomous decision-making in air-to-air confrontation

AU - Chen, Can

AU - Mo, Li

AU - Lv, Maolong

AU - Lin, Defu

AU - Song, Tao

AU - Cao, Jinde

PY - 2024/8

Y1 - 2024/8

N2 - In recent years, autonomous decision-making has emerged as a critical technology in air-to-air confrontation scenarios, garnering significant attention. This paper presents a novel AI algorithm, the Missile Hit Probability Enhanced Actor-Critic (MHPAC), designed for autonomous decision-making in such confrontations, whose primary objective is to maximize the probability of defeating opponents while minimizing the risk of being shot down. By incorporating a pre-trained Missile Hit Probability (MHP) model into reward shaping and exploration within the framework of Reinforcement Learning (RL), the MHPAC algorithm enhances the learning capabilities of the Actor-Critic (AC) algorithm specifically tailored for air-to-air confrontation scenarios. Furthermore, the MHP model is also integrated into the confrontation strategy to inform missile launch decisions. Using the MHPAC algorithm, the confrontation strategy is achieved via the training strategy of curriculum learning and self-play learning. Results demonstrate that the MHPAC algorithm effectively explores efficient maneuvering strategies for missile launch and defense, overcoming challenges associated with sparse and delayed reward signals. The decision-making capabilities of the integrated maneuvering and missile launch strategy are significantly enhanced by the proposed MHPAC algorithm, with a relative win ratio of over 65% against different strategies. Moreover, the trained strategy only needs 0.039 s for real-time decision-making. This research holds considerable promise for achieving air superiority and mission success in complex and dynamic aerial environments.

KW - Actor-critic

KW - Air-to-air confrontation

KW - Autonomous decision-making

KW - Missile hit probability

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=85196718581&partnerID=8YFLogxK

U2 - 10.1016/j.ast.2024.109285

DO - 10.1016/j.ast.2024.109285

M3 - Article

AN - SCOPUS:85196718581

SN - 1270-9638

VL - 151

JO - Aerospace Science and Technology

JF - Aerospace Science and Technology

M1 - 109285

ER -
