Decision-making models on perceptual uncertainty with distributional reinforcement learning

Shuyuan Xu; Qiao Liu; Yuhui Hu; Mengtian Xu; Jiachen Hao

doi:10.1016/j.geits.2022.100062

Shuyuan Xu, Qiao Liu, Yuhui Hu*, Mengtian Xu, Jiachen Hao

*Corresponding author for this work

School of Mechanical Engineering

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract

Decision-making for autonomous vehicles in the presence of obstacle occlusions is difficult because the lack of accurate information affects judgment. Existing methods may lead to overly conservative strategies and time-consuming computations that compromise efficiency. We propose using distributional reinforcement learning to hedge the risk of strategies, optimize the worst cases, and improve the efficiency of the algorithm so that the agent learns better actions. To optimize the worst case, the average value is replaced by a batch of smaller values; combined with frame stacking, we call the resulting method the Efficient-Fully parameterized Quantile Function (E-FQF). The model is evaluated in signal-free intersection crossing scenarios, where it makes more efficient moves and reduces the collision rate compared to conventional reinforcement learning algorithms in the presence of perceptual occlusion. The model is also more robust to data loss than a method with embedded long short-term memory (LSTM).
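The two ideas named in the abstract, scoring an action by a batch of its smaller return values instead of the average, and stacking recent frames, can be illustrated with a minimal Python sketch. This is not the authors' E-FQF implementation: the names `lower_tail_value`, `select_action`, `FrameStack`, and the `fraction` parameter are hypothetical, and the sketch assumes equally weighted quantile values, whereas a fully parameterized quantile function learns its own quantile fractions.

```python
from collections import deque


def lower_tail_value(quantile_values, fraction=0.25):
    """Mean of the smallest `fraction` of the quantile values (at least one).

    Assumption: all quantile values carry equal weight.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(quantile_values)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / k


def select_action(quantiles_per_action, fraction=0.25):
    """Pick the action whose lower-tail value is largest.

    With fraction=1.0 this reduces to the usual mean-based choice.
    """
    scores = [lower_tail_value(q, fraction) for q in quantiles_per_action]
    return max(range(len(scores)), key=scores.__getitem__)


class FrameStack:
    """Keeps the last `num_frames` observations and returns them flattened."""

    def __init__(self, num_frames):
        self.frames = deque(maxlen=num_frames)

    def push(self, observation):
        if not self.frames:
            # Pad with copies of the first observation so the shape is fixed.
            self.frames.extend([observation] * self.frames.maxlen)
        else:
            self.frames.append(observation)
        return [x for frame in self.frames for x in frame]


# Hypothetical return quantiles for two actions at one state.
quantiles = [
    [-10.0, 6.0, 7.0, 8.0],  # action 0: higher mean, but a severe worst case
    [0.0, 1.0, 2.0, 3.0],    # action 1: lower mean, mild worst case
]
risk_averse_choice = select_action(quantiles, fraction=0.5)
mean_based_choice = select_action(quantiles, fraction=1.0)
```

In this toy example the mean-based rule prefers action 0, while the lower-tail rule prefers action 1 because it discounts actions with a severe worst case. Frame stacking gives the policy a short observation history without a recurrent network, which is one plausible reading of the comparison with the memory-based method.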

Original language	English
Article number	100062
Journal	Green Energy and Intelligent Transportation
Volume	2
Issue number	2
DOIs	https://doi.org/10.1016/j.geits.2022.100062
Publication status	Published - Apr 2023

Keywords

Autonomous vehicles
Partially observable Markov decision process
Reinforcement learning
Sensing occlusion
Unsignalized intersections

Cite this

@article{0d583150a5c8462bb9979ad6fdc90b66,

title = "Decision-making models on perceptual uncertainty with distributional reinforcement learning",

abstract = "Decision-making for autonomous vehicles in the presence of obstacle occlusions is difficult because the lack of accurate information affects judgment. Existing methods may lead to overly conservative strategies and time-consuming computations that compromise efficiency. We propose using distributional reinforcement learning to hedge the risk of strategies, optimize the worst cases, and improve the efficiency of the algorithm so that the agent learns better actions. To optimize the worst case, the average value is replaced by a batch of smaller values; combined with frame stacking, we call the resulting method the Efficient-Fully parameterized Quantile Function (E-FQF). The model is evaluated in signal-free intersection crossing scenarios, where it makes more efficient moves and reduces the collision rate compared to conventional reinforcement learning algorithms in the presence of perceptual occlusion. The model is also more robust to data loss than a method with embedded long short-term memory (LSTM).",

keywords = "Autonomous vehicles, Partially observable Markov decision process, Reinforcement learning, Sensing occlusion, Unsignalized intersections",

author = "Shuyuan Xu and Qiao Liu and Yuhui Hu and Mengtian Xu and Jiachen Hao",

note = "Publisher Copyright: {\textcopyright} 2022 The Author(s)",

year = "2023",

month = apr,

doi = "10.1016/j.geits.2022.100062",

language = "English",

volume = "2",

journal = "Green Energy and Intelligent Transportation",

issn = "2773-1537",

publisher = "Elsevier B.V.",

number = "2",

}

TY - JOUR

T1 - Decision-making models on perceptual uncertainty with distributional reinforcement learning

AU - Xu, Shuyuan

AU - Liu, Qiao

AU - Hu, Yuhui

AU - Xu, Mengtian

AU - Hao, Jiachen

PY - 2023/4

Y1 - 2023/4

N2 - Decision-making for autonomous vehicles in the presence of obstacle occlusions is difficult because the lack of accurate information affects judgment. Existing methods may lead to overly conservative strategies and time-consuming computations that compromise efficiency. We propose using distributional reinforcement learning to hedge the risk of strategies, optimize the worst cases, and improve the efficiency of the algorithm so that the agent learns better actions. To optimize the worst case, the average value is replaced by a batch of smaller values; combined with frame stacking, we call the resulting method the Efficient-Fully parameterized Quantile Function (E-FQF). The model is evaluated in signal-free intersection crossing scenarios, where it makes more efficient moves and reduces the collision rate compared to conventional reinforcement learning algorithms in the presence of perceptual occlusion. The model is also more robust to data loss than a method with embedded long short-term memory (LSTM).

KW - Autonomous vehicles

KW - Partially observable Markov decision process

KW - Reinforcement learning

KW - Sensing occlusion

KW - Unsignalized intersections

UR - http://www.scopus.com/inward/record.url?scp=85160511070&partnerID=8YFLogxK

U2 - 10.1016/j.geits.2022.100062

DO - 10.1016/j.geits.2022.100062

M3 - Article

AN - SCOPUS:85160511070

SN - 2773-1537

VL - 2

JO - Green Energy and Intelligent Transportation

JF - Green Energy and Intelligent Transportation

IS - 2

M1 - 100062

ER -