Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning

Jingang Zhao; Minggang Gan

doi:10.1080/00207721.2020.1797223

Jingang Zhao, Minggang Gan^*

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

13 Citations (Scopus)

Abstract

This paper investigates the finite-horizon optimal control problem for continuous-time uncertain nonlinear systems, where the uncertainty refers to partially unknown system dynamics. Unlike the infinite-horizon case, the finite-horizon problem is difficult because the Hamilton–Jacobi–Bellman (HJB) equation is time-varying and must satisfy certain terminal boundary constraints. The partially unknown system dynamics cause additional difficulties. The main contribution of this paper is a cyclic fixed-finite-horizon-based reinforcement learning algorithm that approximately solves the time-varying HJB equation. The algorithm consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, and the optimal parameters are obtained by cycling through them. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.
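
For orientation, the time-varying HJB equation and terminal condition that the abstract refers to can be written in a standard textbook form. The abstract gives neither the system nor the cost, so the control-affine dynamics $\dot{x} = f(x) + g(x)u$, the running cost $Q(x) + u^\top R u$, the terminal cost $\phi$, and the horizon end time $T$ below are assumed notation, not the paper's own formulation.

```latex
% Assumed notation (not taken from the paper):
%   dynamics      \dot{x} = f(x) + g(x) u
%   running cost  Q(x) + u^\top R u,  with R positive definite
%   terminal cost \phi(x(T)),  fixed final time T
\begin{align}
  J(x(t), t) &= \phi\bigl(x(T)\bigr)
     + \int_t^{T} \Bigl( Q\bigl(x(\tau)\bigr) + u(\tau)^\top R\, u(\tau) \Bigr)\, d\tau , \\
  -\frac{\partial V^*(x,t)}{\partial t}
     &= \min_{u} \Bigl[ Q(x) + u^\top R\, u
        + \nabla_x V^*(x,t)^\top \bigl( f(x) + g(x)\, u \bigr) \Bigr] , \\
  V^*\bigl(x(T), T\bigr) &= \phi\bigl(x(T)\bigr) , \\
  u^*(x,t) &= -\tfrac{1}{2} R^{-1} g(x)^\top \nabla_x V^*(x,t) .
\end{align}
```

In the time-invariant infinite-horizon case the value function does not depend on $t$, so the left-hand side of the second line vanishes and no terminal condition is imposed. This is the contrast the abstract draws.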

Original language	English
Pages (from-to)	2429-2440
Number of pages	12
Journal	International Journal of Systems Science
Volume	51
Issue number	13
DOI	https://doi.org/10.1080/00207721.2020.1797223
Publication status	Published - 2 Oct 2020

Cite this

Zhao, J., & Gan, M. (2020). Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning. International Journal of Systems Science, 51(13), 2429-2440. https://doi.org/10.1080/00207721.2020.1797223

@article{d54b1d45363e4293bba37248653b4f9f,

title = "Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning",

abstract = "This paper investigates the finite-horizon optimal control problem for continuous-time uncertain nonlinear systems, where the uncertainty refers to partially unknown system dynamics. Unlike the infinite-horizon case, the finite-horizon problem is difficult because the Hamilton–Jacobi–Bellman (HJB) equation is time-varying and must satisfy certain terminal boundary constraints. The partially unknown system dynamics cause additional difficulties. The main contribution of this paper is a cyclic fixed-finite-horizon-based reinforcement learning algorithm that approximately solves the time-varying HJB equation. The algorithm consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, and the optimal parameters are obtained by cycling through them. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.",

keywords = "Finite-horizon, continuous-time, optimal control, reinforcement learning, uncertain nonlinear systems",

author = "Jingang Zhao and Minggang Gan",

note = "Publisher Copyright: {\textcopyright} 2020 Informa UK Limited, trading as Taylor \& Francis Group.",

year = "2020",

month = oct,

day = "2",

doi = "10.1080/00207721.2020.1797223",

language = "English",

volume = "51",

pages = "2429--2440",

journal = "International Journal of Systems Science",

issn = "0020-7721",

publisher = "Taylor and Francis Ltd.",

number = "13",

}

TY - JOUR

T1 - Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning

AU - Zhao, Jingang

AU - Gan, Minggang

PY - 2020/10/2

Y1 - 2020/10/2

N2 - This paper investigates the finite-horizon optimal control problem for continuous-time uncertain nonlinear systems, where the uncertainty refers to partially unknown system dynamics. Unlike the infinite-horizon case, the finite-horizon problem is difficult because the Hamilton–Jacobi–Bellman (HJB) equation is time-varying and must satisfy certain terminal boundary constraints. The partially unknown system dynamics cause additional difficulties. The main contribution of this paper is a cyclic fixed-finite-horizon-based reinforcement learning algorithm that approximately solves the time-varying HJB equation. The algorithm consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, and the optimal parameters are obtained by cycling through them. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.

AB - This paper investigates the finite-horizon optimal control problem for continuous-time uncertain nonlinear systems, where the uncertainty refers to partially unknown system dynamics. Unlike the infinite-horizon case, the finite-horizon problem is difficult because the Hamilton–Jacobi–Bellman (HJB) equation is time-varying and must satisfy certain terminal boundary constraints. The partially unknown system dynamics cause additional difficulties. The main contribution of this paper is a cyclic fixed-finite-horizon-based reinforcement learning algorithm that approximately solves the time-varying HJB equation. The algorithm consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, and the optimal parameters are obtained by cycling through them. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.

KW - Finite-horizon

KW - continuous-time

KW - optimal control

KW - reinforcement learning

KW - uncertain nonlinear systems

UR - http://www.scopus.com/inward/record.url?scp=85088869201&partnerID=8YFLogxK

U2 - 10.1080/00207721.2020.1797223

DO - 10.1080/00207721.2020.1797223

M3 - Article

AN - SCOPUS:85088869201

SN - 0020-7721

VL - 51

SP - 2429

EP - 2440

JO - International Journal of Systems Science

JF - International Journal of Systems Science

IS - 13

ER -
