
Advancing Autonomous BVR Air Combat: Integrated Strategy Optimization and Adaptive Learning

  • Wenfei Wang*
  • Le Ru
  • Maolong Lv
  • Hailong Xi
  • Li Mo
  • *Corresponding author for this work
  • Air Force Engineering University Xian

Research output: Contribution to journal, Article, peer-reviewed

Abstract

In modern air combat, the complexity of beyond-visual-range (BVR) engagements stems from the dynamic nature of the game environment, the large number of confrontation factors, and continually evolving strategies. These challenges expose the limitations of traditional algorithms on BVR decision-making problems. To address them, this paper proposes a decision-making method that combines strategy reuse with demonstration guidance, aiming to improve the learning efficiency and adaptability of agents in complex environments. In addition, a role-based agent framework and a collaborative learning mechanism are introduced to diversify strategies and promote group cooperation. A training approach driven by strategy entropy further improves the adaptability and robustness of agents in complex BVR air combat scenarios. Simulation results validate the effectiveness and superiority of the approach, highlighting its efficiency and stability in the decision-making process.

Note to Practitioners: In modern BVR air combat, pilots and UAV systems face a highly complex and dynamic confrontation environment. Traditional decision-making methods (such as rule-based systems or conventional reinforcement learning) often struggle to produce flexible, diverse, and adaptive maneuvering decisions in a short time. The result is tactical rigidity, low learning efficiency, and a lack of strategic diversity, which reduces overall operational effectiveness. This paper therefore proposes an intelligent air combat decision-making method based on improved reinforcement learning. The method combines strategy reuse with a demonstration guidance mechanism to significantly improve learning efficiency and decision-making adaptability in the initial stage of training. It enhances tactical diversity and confrontation ability through role-based agent design and a collaborative learning mechanism. It also adopts a training method driven by strategy entropy so that the system remains robust under uncertain confrontation conditions. The method can be embedded in the intelligent decision-making module of a UAV or a manned/unmanned cooperative air combat system, where it supports real-time tactic generation and dynamic adjustment. It can also be embedded in a confrontation training simulator as an AI opponent for pilots and other professionals to train against. The strategy reuse and demonstration guidance mechanism suits settings with high training cost and low initial exploration efficiency, such as robot control and industrial control. The role-based multi-agent collaborative learning method may serve as a reference for multi-agent systems with a clear division of tasks and cooperative completion, such as logistics robot scheduling. In the simulation experiments, the method outperforms the traditional reinforcement learning method in decision-making speed, cost-effectiveness ratio, and win rate, indicating strong potential for engineering deployment. At present, the method still depends on a high-fidelity simulation environment and is sensitive to model accuracy, and its generalization ability in extreme confrontation scenarios needs further verification. Future work could introduce an online transfer learning mechanism and combined virtual-real training to further improve the practicality and transferability of the method.

Original language: English
Pages (from-to): 9417-9435
Number of pages: 19
Journal: IEEE Transactions on Automation Science and Engineering
Volume: 23
DOI
Publication status: Published - 2026
