Action decoupled SAC reinforcement learning with discrete-continuous hybrid action spaces

Yahao Xu, Yiran Wei*, Keyang Jiang, Li Chen, Di Wang, Hongbin Deng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Most existing Deep Reinforcement Learning (DRL) algorithms apply only to discrete or continuous action spaces. However, an agent often has both continuous and discrete actions, known as a hybrid action space. This paper proposes an action-decoupled algorithm for hybrid action spaces. Specifically, the hybrid action is decoupled, and the original agent in the hybrid action space is abstracted into two agents, each containing only a discrete or a continuous action space. The discrete and continuous actions are executed simultaneously and independently of each other. We use the Soft Actor-Critic (SAC) algorithm as the optimization method and name the proposed algorithm Action Decoupled SAC (AD-SAC). We handle the resulting multi-agent problem with a Centralized Training Decentralized Execution (CTDE) framework, and we reduce the concatenation of partial agent observations to avoid interference from redundant observations. We design a hybrid action space environment for Unmanned Aerial Vehicle (UAV) path planning and gimbal scanning using AirSim. The results show that our algorithm achieves better convergence and robustness than the discretization, relaxation, and Parametrized Deep Q-Networks Learning (P-DQN) algorithms. Finally, we carry out a Hardware-in-the-Loop (HITL) simulation experiment based on Pixhawk to verify the feasibility of our algorithm.
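To illustrate the decoupling idea described in the abstract, the following is a minimal sketch (not the authors' code) of how a hybrid action can be split across two SAC-style actors, one discrete and one continuous, that sample their sub-actions independently and simultaneously. The network sizes, observation handling, and the omitted critics and temperature terms are assumptions for illustration only.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class DiscretePolicy(nn.Module):
    """Categorical actor over the discrete sub-action."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs: torch.Tensor):
        dist = Categorical(logits=self.net(obs))
        a = dist.sample()
        return a, dist.log_prob(a)

class ContinuousPolicy(nn.Module):
    """Squashed-Gaussian actor over the continuous sub-action."""
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.mu = nn.Linear(64, act_dim)
        self.log_std = nn.Linear(64, act_dim)

    def forward(self, obs: torch.Tensor):
        h = self.net(obs)
        dist = Normal(self.mu(h), self.log_std(h).clamp(-5, 2).exp())
        u = dist.rsample()                 # reparameterized sample
        a = torch.tanh(u)                  # squash to [-1, 1]
        # tanh change-of-variables correction to the log-density
        logp = (dist.log_prob(u) - torch.log(1 - a.pow(2) + 1e-6)).sum(-1)
        return a, logp

# Each actor acts on its own (possibly partial) observation; the two
# outputs are concatenated into one hybrid action for the environment.
obs_d, obs_c = torch.randn(1, 8), torch.randn(1, 8)
pi_d, pi_c = DiscretePolicy(8, 4), ContinuousPolicy(8, 2)
a_d, _ = pi_d(obs_d)
a_c, _ = pi_c(obs_c)
hybrid_action = (a_d.item(), a_c.squeeze(0).tolist())
```

Under a CTDE setup as named in the abstract, each actor would condition only on its own observation at execution time, while centralized critics could see the joint information during training.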

Original language: English
Pages (from-to): 141-151
Number of pages: 11
Journal: Neurocomputing
Volume: 537
DOI
Publication status: Published - 7 Jun 2023
