TY - JOUR
T1 - Reinforcement Learning-Based Model Predictive Control for Discrete-Time Systems
AU - Lin, Min
AU - Sun, Zhongqi
AU - Xia, Yuanqing
AU - Zhang, Jinhui
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/3/1
Y1 - 2024/3/1
N2 - This article proposes a novel reinforcement learning-based model predictive control (RLMPC) scheme for discrete-time systems. The scheme integrates model predictive control (MPC) and reinforcement learning (RL) through policy iteration (PI): MPC serves as the policy generator, and RL is employed to evaluate the generated policy. The resulting value function is then used as the terminal cost of MPC, thereby improving the generated policy. This eliminates the need for the offline design of the terminal cost, the auxiliary controller, and the terminal constraint required in traditional MPC. Moreover, the proposed RLMPC allows a more flexible choice of the prediction horizon because the terminal constraint is removed, which has great potential for reducing the computational burden. We provide a rigorous analysis of the convergence, feasibility, and stability properties of RLMPC. Simulation results show that RLMPC achieves nearly the same performance as traditional MPC for linear systems and outperforms it for nonlinear ones.
AB - This article proposes a novel reinforcement learning-based model predictive control (RLMPC) scheme for discrete-time systems. The scheme integrates model predictive control (MPC) and reinforcement learning (RL) through policy iteration (PI): MPC serves as the policy generator, and RL is employed to evaluate the generated policy. The resulting value function is then used as the terminal cost of MPC, thereby improving the generated policy. This eliminates the need for the offline design of the terminal cost, the auxiliary controller, and the terminal constraint required in traditional MPC. Moreover, the proposed RLMPC allows a more flexible choice of the prediction horizon because the terminal constraint is removed, which has great potential for reducing the computational burden. We provide a rigorous analysis of the convergence, feasibility, and stability properties of RLMPC. Simulation results show that RLMPC achieves nearly the same performance as traditional MPC for linear systems and outperforms it for nonlinear ones.
KW - Discrete-time systems
KW - model predictive control
KW - policy iteration (PI)
KW - reinforcement learning (RL)
UR - http://www.scopus.com/inward/record.url?scp=85160228579&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2023.3273590
DO - 10.1109/TNNLS.2023.3273590
M3 - Article
C2 - 37204957
AN - SCOPUS:85160228579
SN - 2162-237X
VL - 35
SP - 3312
EP - 3324
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 3
ER -