Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning

Jingang Zhao; Minggang Gan

doi:10.1080/00207721.2020.1797223

Jingang Zhao, Minggang Gan*

*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)

Abstract

This paper investigates the finite-horizon optimal control problem for continuous-time uncertain nonlinear systems, where the uncertainty refers to partially unknown system dynamics. Unlike the infinite-horizon case, the Hamilton–Jacobi–Bellman (HJB) equation of the finite-horizon problem is time-varying and must satisfy terminal boundary constraints, which makes it harder to solve. The partially unknown system dynamics add further difficulty. The main contribution of this paper is a cyclic fixed-finite-horizon-based reinforcement learning algorithm that approximately solves the time-varying HJB equation. The algorithm consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, and the optimal parameters are obtained by repeating them cyclically. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.
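
For orientation, the time-varying HJB equation and terminal constraint mentioned in the abstract can be written out for a standard textbook setting. The block below is only a sketch under stated assumptions: this record does not give the paper's own formulation, so the input-affine dynamics, the quadratic control penalty, and the symbols f, g, Q, R, \phi, t_0 and t_f are illustrative choices, not the authors' notation.

```latex
% Illustrative textbook formulation; all symbols are assumptions, not taken from the paper.
\begin{align*}
% Assumed input-affine dynamics; the drift f is the partially unknown part
\dot{x}(t) &= f(x(t)) + g(x(t))\,u(t), \qquad x(t_0) = x_0, \\
% Assumed finite-horizon cost with terminal penalty \phi and R symmetric positive definite
V(x(t), t) &= \phi(x(t_f)) + \int_{t}^{t_f} \Big( Q(x(\tau)) + u(\tau)^{\top} R\, u(\tau) \Big)\, d\tau, \\
% Time-varying HJB equation
-\frac{\partial V^{*}}{\partial t} &= \min_{u} \Big[ Q(x) + u^{\top} R\, u
    + (\nabla_x V^{*})^{\top} \big( f(x) + g(x)\,u \big) \Big], \\
% Terminal boundary condition
V^{*}(x, t_f) &= \phi(x), \\
% Minimizer of the bracketed term (set its gradient in u to zero)
u^{*}(x, t) &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla_x V^{*}(x, t).
\end{align*}
```

In the infinite-horizon case the value function does not depend on t, so the left-hand side of the HJB equation vanishes and no terminal condition is imposed; the explicit time derivative and the condition at t_f are what distinguish the finite-horizon problem.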

Original language	English
Pages (from-to)	2429-2440
Number of pages	12
Journal	International Journal of Systems Science
Volume	51
Issue number	13
DOIs	https://doi.org/10.1080/00207721.2020.1797223
Publication status	Published - 2 Oct 2020

Keywords

Finite-horizon
continuous-time
optimal control
reinforcement learning
uncertain nonlinear systems

Cite this

Zhao, J., & Gan, M. (2020). Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning. International Journal of Systems Science, 51(13), 2429-2440. https://doi.org/10.1080/00207721.2020.1797223

@article{d54b1d45363e4293bba37248653b4f9f,

title = "Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning",

abstract = "This paper investigates the finite-horizon optimal control problem for continuous-time uncertain nonlinear systems, where the uncertainty refers to partially unknown system dynamics. Unlike the infinite-horizon case, the Hamilton–Jacobi–Bellman (HJB) equation of the finite-horizon problem is time-varying and must satisfy terminal boundary constraints, which makes it harder to solve. The partially unknown system dynamics add further difficulty. The main contribution of this paper is a cyclic fixed-finite-horizon-based reinforcement learning algorithm that approximately solves the time-varying HJB equation. The algorithm consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, and the optimal parameters are obtained by repeating them cyclically. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.",

keywords = "Finite-horizon, continuous-time, optimal control, reinforcement learning, uncertain nonlinear systems",

author = "Jingang Zhao and Minggang Gan",

note = "Publisher Copyright: {\textcopyright} 2020 Informa UK Limited, trading as Taylor \& Francis Group.",

year = "2020",

month = oct,

day = "2",

doi = "10.1080/00207721.2020.1797223",

language = "English",

volume = "51",

pages = "2429--2440",

journal = "International Journal of Systems Science",

issn = "0020-7721",

publisher = "Taylor and Francis Ltd.",

number = "13",

}

TY - JOUR

T1 - Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning

AU - Zhao, Jingang

AU - Gan, Minggang

PY - 2020/10/2

Y1 - 2020/10/2

AB - This paper investigates the finite-horizon optimal control problem for continuous-time uncertain nonlinear systems, where the uncertainty refers to partially unknown system dynamics. Unlike the infinite-horizon case, the Hamilton–Jacobi–Bellman (HJB) equation of the finite-horizon problem is time-varying and must satisfy terminal boundary constraints, which makes it harder to solve. The partially unknown system dynamics add further difficulty. The main contribution of this paper is a cyclic fixed-finite-horizon-based reinforcement learning algorithm that approximately solves the time-varying HJB equation. The algorithm consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, and the optimal parameters are obtained by repeating them cyclically. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.

KW - Finite-horizon

KW - continuous-time

KW - optimal control

KW - reinforcement learning

KW - uncertain nonlinear systems

UR - http://www.scopus.com/inward/record.url?scp=85088869201&partnerID=8YFLogxK

U2 - 10.1080/00207721.2020.1797223

DO - 10.1080/00207721.2020.1797223

M3 - Article

AN - SCOPUS:85088869201

SN - 0020-7721

VL - 51

SP - 2429

EP - 2440

JO - International Journal of Systems Science

JF - International Journal of Systems Science

IS - 13

ER -
