TY - JOUR
T1 - Reinforcement Learning-Based Model Predictive Control for Discrete-Time Systems
AU - Lin, Min
AU - Sun, Zhongqi
AU - Xia, Yuanqing
AU - Zhang, Jinhui
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/3/1
Y1 - 2024/3/1
N2 - This article proposes a novel reinforcement learning-based model predictive control (RLMPC) scheme for discrete-time systems. The scheme integrates model predictive control (MPC) and reinforcement learning (RL) through policy iteration (PI): MPC serves as the policy generator, and RL is employed to evaluate the generated policy. The resulting value function is then used as the terminal cost of MPC, thereby improving the generated policy. This eliminates the need for the offline design of the terminal cost, the auxiliary controller, and the terminal constraint required in traditional MPC. Moreover, the proposed RLMPC allows a more flexible choice of the prediction horizon because the terminal constraint is removed, which has great potential for reducing the computational burden. We provide a rigorous analysis of the convergence, feasibility, and stability properties of RLMPC. Simulation results show that RLMPC achieves nearly the same performance as traditional MPC for linear systems and outperforms it for nonlinear ones.
AB - This article proposes a novel reinforcement learning-based model predictive control (RLMPC) scheme for discrete-time systems. The scheme integrates model predictive control (MPC) and reinforcement learning (RL) through policy iteration (PI): MPC serves as the policy generator, and RL is employed to evaluate the generated policy. The resulting value function is then used as the terminal cost of MPC, thereby improving the generated policy. This eliminates the need for the offline design of the terminal cost, the auxiliary controller, and the terminal constraint required in traditional MPC. Moreover, the proposed RLMPC allows a more flexible choice of the prediction horizon because the terminal constraint is removed, which has great potential for reducing the computational burden. We provide a rigorous analysis of the convergence, feasibility, and stability properties of RLMPC. Simulation results show that RLMPC achieves nearly the same performance as traditional MPC for linear systems and outperforms it for nonlinear ones.
KW - Discrete-time systems
KW - model predictive control
KW - policy iteration (PI)
KW - reinforcement learning (RL)
UR - http://www.scopus.com/inward/record.url?scp=85160228579&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2023.3273590
DO - 10.1109/TNNLS.2023.3273590
M3 - Article
C2 - 37204957
AN - SCOPUS:85160228579
SN - 2162-237X
VL - 35
SP - 3312
EP - 3324
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 3
ER -