A Reinforcement Learning Method Based on an Improved Sampling Mechanism for Unmanned Aerial Vehicle Penetration

Yue Wang; Kexv Li; Xing Zhuang; Xinyu Liu; Hanyu Li

doi:10.3390/aerospace10070642

Yue Wang^*, Kexv Li, Xing Zhuang, Xinyu Liu, Hanyu Li

^*Corresponding author for this work

School of Mechatronical Engineering

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

The penetration of unmanned aerial vehicles (UAVs) is an important aspect of UAV games. In recent years, UAV penetration has generally been solved using artificial intelligence methods such as reinforcement learning. However, the high sample demand of the reinforcement learning method poses a significant challenge specifically in the context of UAV games. To improve the sample utilization in UAV penetration, this paper innovatively proposes an improved sampling mechanism called task completion division (TCD) and combines this method with the soft actor critic (SAC) algorithm to form the TCD-SAC algorithm. To compare the performance of the TCD-SAC algorithm with other related baseline algorithms, this study builds a dynamic environment, a UAV game, and conducts training and testing experiments in this environment. The results show that among all the algorithms, the TCD-SAC algorithm has the highest sample utilization rate and the best actual penetration results, and the algorithm has a good adaptability and robustness in dynamic environments.

Original language	English
Article number	642
Journal	Aerospace
Volume	10
Issue number	7
DOI	https://doi.org/10.3390/aerospace10070642
Publication status	Published - Jul 2023

Cite this

@article{343a40c2bf5f46078d2deea893cbaea9,

title = "A Reinforcement Learning Method Based on an Improved Sampling Mechanism for Unmanned Aerial Vehicle Penetration",

abstract = "The penetration of unmanned aerial vehicles (UAVs) is an important aspect of UAV games. In recent years, UAV penetration has generally been solved using artificial intelligence methods such as reinforcement learning. However, the high sample demand of the reinforcement learning method poses a significant challenge specifically in the context of UAV games. To improve the sample utilization in UAV penetration, this paper innovatively proposes an improved sampling mechanism called task completion division (TCD) and combines this method with the soft actor critic (SAC) algorithm to form the TCD-SAC algorithm. To compare the performance of the TCD-SAC algorithm with other related baseline algorithms, this study builds a dynamic environment, a UAV game, and conducts training and testing experiments in this environment. The results show that among all the algorithms, the TCD-SAC algorithm has the highest sample utilization rate and the best actual penetration results, and the algorithm has a good adaptability and robustness in dynamic environments.",

keywords = "UAV penetration, reinforcement learning, sample utilization, task completion division",

author = "Yue Wang and Kexv Li and Xing Zhuang and Xinyu Liu and Hanyu Li",

note = "Publisher Copyright: {\textcopyright} 2023 by the authors.",

year = "2023",

month = jul,

doi = "10.3390/aerospace10070642",

language = "English",

volume = "10",

journal = "Aerospace",

issn = "2226-4310",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "7",

}

TY - JOUR

T1 - A Reinforcement Learning Method Based on an Improved Sampling Mechanism for Unmanned Aerial Vehicle Penetration

AU - Wang, Yue

AU - Li, Kexv

AU - Zhuang, Xing

AU - Liu, Xinyu

AU - Li, Hanyu

PY - 2023/7

Y1 - 2023/7

N2 - The penetration of unmanned aerial vehicles (UAVs) is an important aspect of UAV games. In recent years, UAV penetration has generally been solved using artificial intelligence methods such as reinforcement learning. However, the high sample demand of the reinforcement learning method poses a significant challenge specifically in the context of UAV games. To improve the sample utilization in UAV penetration, this paper innovatively proposes an improved sampling mechanism called task completion division (TCD) and combines this method with the soft actor critic (SAC) algorithm to form the TCD-SAC algorithm. To compare the performance of the TCD-SAC algorithm with other related baseline algorithms, this study builds a dynamic environment, a UAV game, and conducts training and testing experiments in this environment. The results show that among all the algorithms, the TCD-SAC algorithm has the highest sample utilization rate and the best actual penetration results, and the algorithm has a good adaptability and robustness in dynamic environments.

AB - The penetration of unmanned aerial vehicles (UAVs) is an important aspect of UAV games. In recent years, UAV penetration has generally been solved using artificial intelligence methods such as reinforcement learning. However, the high sample demand of the reinforcement learning method poses a significant challenge specifically in the context of UAV games. To improve the sample utilization in UAV penetration, this paper innovatively proposes an improved sampling mechanism called task completion division (TCD) and combines this method with the soft actor critic (SAC) algorithm to form the TCD-SAC algorithm. To compare the performance of the TCD-SAC algorithm with other related baseline algorithms, this study builds a dynamic environment, a UAV game, and conducts training and testing experiments in this environment. The results show that among all the algorithms, the TCD-SAC algorithm has the highest sample utilization rate and the best actual penetration results, and the algorithm has a good adaptability and robustness in dynamic environments.

KW - UAV penetration

KW - reinforcement learning

KW - sample utilization

KW - task completion division

UR - http://www.scopus.com/inward/record.url?scp=85165939682&partnerID=8YFLogxK

U2 - 10.3390/aerospace10070642

DO - 10.3390/aerospace10070642

M3 - Article

AN - SCOPUS:85165939682

SN - 2226-4310

VL - 10

JO - Aerospace

JF - Aerospace

IS - 7

M1 - 642

ER -
