Decision-making models on perceptual uncertainty with distributional reinforcement learning

Shuyuan Xu; Qiao Liu; Yuhui Hu; Mengtian Xu; Jiachen Hao

doi:10.1016/j.geits.2022.100062

Shuyuan Xu, Qiao Liu, Yuhui Hu^*, Mengtian Xu, Jiachen Hao

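^*Corresponding author for this work

School of Mechanical and Vehicle Engineering

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract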

Decision-making for autonomous vehicles in the presence of obstacle occlusions is difficult because the lack of accurate information affects the judgment. Existing methods may lead to overly conservative strategies and time-consuming computations that cannot be balanced with efficiency. We propose to use distributional reinforcement learning to hedge the risk of strategies, optimize the worse cases, and improve the efficiency of the algorithm so that the agent learns better actions. A batch of smaller values is used to replace the average value to optimize the worse case, and combined with frame stacking, we call it Efficient-Fully parameterized Quantile Function (E-FQF). This model is used to evaluate signal-free intersection crossing scenarios and makes more efficient moves and reduces the collision rate compared to conventional reinforcement learning algorithms in the presence of perceived occlusion. The model also has robustness in the case of data loss compared to the method with embedded long and short term memory.
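The abstract's two ingredients, scoring an action by a batch of its smaller return values instead of the average, and stacking recent frames, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `lower_tail_mean`, `select_action`, and `FrameStack`, the `fraction` parameter, and the equal weighting of quantile values are all assumptions made for illustration (FQF itself learns non-uniform quantile fractions).

```python
import math
from collections import deque


def lower_tail_mean(quantile_values, fraction=0.25):
    """Average the smallest `fraction` of the quantile values (at least one).

    Assumption: every quantile value carries equal weight.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(quantile_values)
    k = max(1, math.ceil(fraction * len(ordered)))
    return sum(ordered[:k]) / k


def select_action(quantiles_per_action, fraction=0.25):
    """Pick the action whose lower-tail mean is largest.

    With fraction=1.0 this reduces to the usual mean-value greedy choice.
    """
    scores = [lower_tail_mean(q, fraction) for q in quantiles_per_action]
    return max(range(len(scores)), key=scores.__getitem__)


class FrameStack:
    """Keep the last `depth` observations as one stacked input (hypothetical helper)."""

    def __init__(self, depth):
        self.depth = depth
        self.frames = deque(maxlen=depth)

    def push(self, observation):
        if not self.frames:
            # Pad with copies of the first observation so the stack is full.
            self.frames.extend([observation] * self.depth)
        else:
            self.frames.append(observation)
        return list(self.frames)


# Hypothetical quantile estimates of the return for two actions.
quantiles = [
    [-10.0, 4.0, 5.0, 6.0],  # action 0: higher mean, but a bad worst case
    [0.0, 1.0, 1.0, 2.0],    # action 1: lower mean, mild worst case
]
mean_choice = select_action(quantiles, fraction=1.0)
risk_averse_choice = select_action(quantiles, fraction=0.5)
```

In this toy case the mean-based choice favors action 0, while the lower-tail choice favors action 1 because action 0 has a severe worst case. The same trade-off is what makes a worse-case criterion attractive when occlusion hides other vehicles.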

Original language	English
Article number	100062
Journal	Green Energy and Intelligent Transportation
Volume	2
Issue number	2
DOI	https://doi.org/10.1016/j.geits.2022.100062
Publication status	Published - Apr 2023

Cite this

@article{0d583150a5c8462bb9979ad6fdc90b66,

title = "Decision-making models on perceptual uncertainty with distributional reinforcement learning",

abstract = "Decision-making for autonomous vehicles in the presence of obstacle occlusions is difficult because the lack of accurate information affects the judgment. Existing methods may lead to overly conservative strategies and time-consuming computations that cannot be balanced with efficiency. We propose to use distributional reinforcement learning to hedge the risk of strategies, optimize the worse cases, and improve the efficiency of the algorithm so that the agent learns better actions. A batch of smaller values is used to replace the average value to optimize the worse case, and combined with frame stacking, we call it Efficient-Fully parameterized Quantile Function (E-FQF). This model is used to evaluate signal-free intersection crossing scenarios and makes more efficient moves and reduces the collision rate compared to conventional reinforcement learning algorithms in the presence of perceived occlusion. The model also has robustness in the case of data loss compared to the method with embedded long and short term memory.",

keywords = "Autonomous vehicles, Partially observable Markov decision process, Reinforcement learning, Sensing occlusion, Unsignalized intersections",

author = "Shuyuan Xu and Qiao Liu and Yuhui Hu and Mengtian Xu and Jiachen Hao",

note = "Publisher Copyright: {\textcopyright} 2022 The Author(s)",

year = "2023",

month = apr,

doi = "10.1016/j.geits.2022.100062",

language = "English",

volume = "2",

journal = "Green Energy and Intelligent Transportation",

issn = "2773-1537",

publisher = "Elsevier B.V.",

number = "2",

}

TY - JOUR

T1 - Decision-making models on perceptual uncertainty with distributional reinforcement learning

AU - Xu, Shuyuan

AU - Liu, Qiao

AU - Hu, Yuhui

AU - Xu, Mengtian

AU - Hao, Jiachen

PY - 2023/4

Y1 - 2023/4

N2 - Decision-making for autonomous vehicles in the presence of obstacle occlusions is difficult because the lack of accurate information affects the judgment. Existing methods may lead to overly conservative strategies and time-consuming computations that cannot be balanced with efficiency. We propose to use distributional reinforcement learning to hedge the risk of strategies, optimize the worse cases, and improve the efficiency of the algorithm so that the agent learns better actions. A batch of smaller values is used to replace the average value to optimize the worse case, and combined with frame stacking, we call it Efficient-Fully parameterized Quantile Function (E-FQF). This model is used to evaluate signal-free intersection crossing scenarios and makes more efficient moves and reduces the collision rate compared to conventional reinforcement learning algorithms in the presence of perceived occlusion. The model also has robustness in the case of data loss compared to the method with embedded long and short term memory.

KW - Autonomous vehicles

KW - Partially observable Markov decision process

KW - Reinforcement learning

KW - Sensing occlusion

KW - Unsignalized intersections

UR - http://www.scopus.com/inward/record.url?scp=85160511070&partnerID=8YFLogxK

U2 - 10.1016/j.geits.2022.100062

DO - 10.1016/j.geits.2022.100062

M3 - Article

AN - SCOPUS:85160511070

SN - 2773-1537

VL - 2

JO - Green Energy and Intelligent Transportation

JF - Green Energy and Intelligent Transportation

IS - 2

M1 - 100062

ER -
