Multi-Attacker Multi-Defender Target Guarding Game Using Hierarchical Reinforcement Learning

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This study proposes a Hierarchical Reinforcement Learning (HRL) framework for multi-agent target guarding in dynamic environments in which attackers and defenders each coordinate within their own team. The framework decomposes defender actions into patrol, pursuit, and encirclement subtasks, and a high-level multi-head attention mechanism dynamically allocates these subtasks based on global observations. Defenders are trained with Multi-Agent Proximal Policy Optimization (MAPPO) and execute a low-level policy to carry out the allocated subtasks. Customized reward functions promote collision avoidance, target protection, and coordination: patrol rewards optimize circular surveillance, pursuit rewards minimize distances to attackers, and encirclement rewards enhance cooperation. Attackers employ evasion tactics to breach the defense. Simulations in 2D environments demonstrate effective subtask transitions and coordinated interceptions, validating the framework's robustness against nonstationary interactions.
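
The abstract names the three subtask rewards but does not give their formulas. As a rough illustration only, the sketch below shows one way such rewards could be shaped in a 2D environment. Every function name, the patrol radius parameter, and the angular-gap measure for encirclement are assumptions made for this example and are not taken from the paper.

```python
import math

# Hypothetical reward shapes for the three defender subtasks.
# These are illustrative assumptions, not the paper's reward functions.

def distance(a, b):
    """Euclidean distance between two 2D points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def patrol_reward(defender, target, radius):
    """Penalize deviation from a circle of the given radius around the target."""
    return -abs(distance(defender, target) - radius)

def pursuit_reward(defender, attacker):
    """Penalize the distance between a defender and the attacker it pursues."""
    return -distance(defender, attacker)

def encirclement_reward(defenders, attacker):
    """Zero when defenders are evenly spaced in angle around the attacker,
    negative when one angular gap is wider than the even spacing."""
    n = len(defenders)
    if n < 2:
        return 0.0
    angles = sorted(
        math.atan2(d[1] - attacker[1], d[0] - attacker[0]) for d in defenders
    )
    # Gaps between neighbouring defenders, plus the wrap-around gap.
    gaps = [angles[i + 1] - angles[i] for i in range(n - 1)]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))
    return -(max(gaps) - 2 * math.pi / n)
```

In a hierarchical setup like the one described, a high-level allocator would choose which of these terms applies to each defender at a given step. Collision-avoidance and target-protection terms, which the abstract also mentions, are left out of this sketch.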

Original language: English
Title of host publication: Proceedings of the 44th Chinese Control Conference, CCC 2025
Editors: Jian Sun, Hongpeng Yin
Publisher: IEEE Computer Society
Pages: 2743-2748
Number of pages: 6
ISBN (Electronic): 9789887581611
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 44th Chinese Control Conference, CCC 2025 - Chongqing, China
Duration: 28 Jul 2025 to 30 Jul 2025

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 44th Chinese Control Conference, CCC 2025
Country/Territory: China
City: Chongqing
Period: 28/07/25 to 30/07/25

Keywords

  • Hierarchical Reinforcement Learning
  • Multi-Agent Systems
  • Target Guarding Game
