Reinforcement learning for ramp control: An analysis of learning parameters

Chao Lu; Jie Huang; Jianwei Gong

doi:10.7307/ptt.v28i4.1830

Chao Lu, Jie Huang^*, Jianwei Gong

^*Corresponding author for this work

School of Mechanical and Vehicle Engineering

Research output: Contribution to journal › Article › Peer-reviewed

7 citations (Scopus)

Abstract

Reinforcement Learning (RL) has been proposed to deal with ramp control problems under dynamic traffic conditions; however, the behaviour and impact of its learning parameters have not been studied sufficiently. This paper describes a ramp control agent based on the RL mechanism and analyzes the influence of three learning parameters, namely the learning rate, the discount rate and the action selection parameter, on algorithm performance. Two indices, one for learning speed and one for convergence stability, were used to measure performance, and on that basis a series of simulation-based experiments was designed and conducted using a macroscopic traffic flow model. Simulation results showed that the learning rate and the action selection parameter had a greater impact on algorithm performance than the discount rate. Based on the analysis, suggestions are provided on how to select parameter values that achieve better performance.
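To make the three parameters concrete, the following is a minimal sketch of where they would enter a tabular Q-learning agent (Q-learning is listed among the record's keywords). It is an illustration under stated assumptions, not the authors' implementation: the epsilon-greedy rule stands in for whatever action selection scheme the paper uses, and the function names, the toy Q-table and the numeric values are hypothetical.

```python
import random


def select_action(q_row, epsilon, rng):
    """Epsilon-greedy selection (an assumed scheme, not confirmed by the abstract).

    epsilon plays the role of the action selection parameter: with
    probability epsilon a random action is explored, otherwise the
    action with the highest current Q-value is chosen.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One tabular Q-learning step.

    alpha is the learning rate (how far the estimate moves toward the target);
    gamma is the discount rate (how much future value counts).
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Toy table: 2 hypothetical states x 2 hypothetical actions.
q = [[0.0, 0.0], [0.0, 1.0]]
new_value = q_update(q, state=0, action=1, reward=1.0,
                     next_state=1, alpha=0.5, gamma=0.9)
greedy = select_action([0.1, 0.7], epsilon=0.0, rng=random.Random(0))
```

With `epsilon=0.0` the selection is purely greedy, so raising it trades exploitation for exploration; `alpha` and `gamma` only affect the update step.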

Original language	English
Pages (from-to)	371-381
Number of pages	11
Journal	Promet - Traffic and Transportation
Volume	28
Issue number	4
DOI	https://doi.org/10.7307/ptt.v28i4.1830
Publication status	Published - Aug 2016

Cite this

Lu, C., Huang, J., & Gong, J. (2016). Reinforcement learning for ramp control: An analysis of learning parameters. Promet - Traffic and Transportation, 28(4), 371-381. https://doi.org/10.7307/ptt.v28i4.1830

@article{6e92fb9f2d344825aae074a52535ba11,

title = "Reinforcement learning for ramp control: An analysis of learning parameters",

abstract = "Reinforcement Learning (RL) has been proposed to deal with ramp control problems under dynamic traffic conditions; however, there is a lack of sufficient research on the behaviour and impacts of different learning parameters. This paper describes a ramp control agent based on the RL mechanism and thoroughly analyzed the influence of three learning parameters; namely, learning rate, discount rate and action selection parameter on the algorithm performance. Two indices for the learning speed and convergence stability were used to measure the algorithm performance, based on which a series of simulation-based experiments were designed and conducted by using a macroscopic traffic flow model. Simulation results showed that, compared with the discount rate, the learning rate and action selection parameter made more remarkable impacts on the algorithm performance. Based on the analysis, some suggestions about how to select suitable parameter values that can achieve a superior performance were provided.",

keywords = "Agent, Macroscopic traffic flow model, Q-learning, Ramp control, Reinforcement learning",

author = "Chao Lu and Jie Huang and Jianwei Gong",

year = "2016",

month = aug,

doi = "10.7307/ptt.v28i4.1830",

language = "English",

volume = "28",

pages = "371--381",

journal = "Promet - Traffic and Transportation",

issn = "0353-5320",

publisher = "Faculty of Transport and Traffic Engineering",

number = "4",

}

TY - JOUR

T1 - Reinforcement learning for ramp control

T2 - An analysis of learning parameters

AU - Lu, Chao

AU - Huang, Jie

AU - Gong, Jianwei

PY - 2016/8

Y1 - 2016/8

N2 - Reinforcement Learning (RL) has been proposed to deal with ramp control problems under dynamic traffic conditions; however, there is a lack of sufficient research on the behaviour and impacts of different learning parameters. This paper describes a ramp control agent based on the RL mechanism and thoroughly analyzed the influence of three learning parameters; namely, learning rate, discount rate and action selection parameter on the algorithm performance. Two indices for the learning speed and convergence stability were used to measure the algorithm performance, based on which a series of simulation-based experiments were designed and conducted by using a macroscopic traffic flow model. Simulation results showed that, compared with the discount rate, the learning rate and action selection parameter made more remarkable impacts on the algorithm performance. Based on the analysis, some suggestions about how to select suitable parameter values that can achieve a superior performance were provided.

AB - Reinforcement Learning (RL) has been proposed to deal with ramp control problems under dynamic traffic conditions; however, there is a lack of sufficient research on the behaviour and impacts of different learning parameters. This paper describes a ramp control agent based on the RL mechanism and thoroughly analyzed the influence of three learning parameters; namely, learning rate, discount rate and action selection parameter on the algorithm performance. Two indices for the learning speed and convergence stability were used to measure the algorithm performance, based on which a series of simulation-based experiments were designed and conducted by using a macroscopic traffic flow model. Simulation results showed that, compared with the discount rate, the learning rate and action selection parameter made more remarkable impacts on the algorithm performance. Based on the analysis, some suggestions about how to select suitable parameter values that can achieve a superior performance were provided.

KW - Agent

KW - Macroscopic traffic flow model

KW - Q-learning

KW - Ramp control

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=84987909315&partnerID=8YFLogxK

U2 - 10.7307/ptt.v28i4.1830

DO - 10.7307/ptt.v28i4.1830

M3 - Article

AN - SCOPUS:84987909315

SN - 0353-5320

VL - 28

SP - 371

EP - 381

JO - Promet - Traffic and Transportation

JF - Promet - Traffic and Transportation

IS - 4

ER -
