TY - JOUR
T1 - Optimal Output Feedback Learning Control for Continuous-Time Linear Quadratic Regulation
AU - Xie, Kedi
AU - Guay, Martin
AU - Lu, Maobin
AU - Wang, Shimin
AU - Deng, Fang
N1 - Publisher Copyright:
© 1963-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - The classical linear quadratic regulation (LQR) problem for linear systems under state feedback has been widely studied. However, the LQR problem under dynamic output feedback with optimal transient performance remains open, mainly because the observer error inevitably renders the transient performance of the closed-loop system suboptimal. In this paper, we propose an optimal dynamic output feedback learning control approach that solves the LQR problem for continuous-time linear systems with unknown dynamics. In particular, we introduce novel internal dynamics, referred to as an internal model. Unlike the classical p-copy internal model, the proposed internal model is driven by the input and output of the system; its role is to compensate for the transient error of the observer so that the output feedback LQR problem is solved with guaranteed optimality. A model-free learning algorithm is developed to estimate the optimal control gain of the dynamic output feedback controller. The algorithm requires no prior knowledge of the system matrices or of the system's initial state, and thus yields an optimal solution to the model-free LQR problem. The effectiveness of the proposed method is illustrated on an aircraft control system.
KW - linear quadratic regulation
KW - observer error
KW - optimal output feedback control
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85216082591&partnerID=8YFLogxK
U2 - 10.1109/TAC.2025.3532182
DO - 10.1109/TAC.2025.3532182
M3 - Article
AN - SCOPUS:85216082591
SN - 0018-9286
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
ER -