Off-Policy Q-Learning for Infinite Horizon LQR Problem with Unknown Dynamics

Xinxing Li, Zhihong Peng, Li Liang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

In this paper, a novel online Q-learning approach is proposed to solve the infinite horizon linear quadratic regulator (IHLQR) problem for continuous-time (CT) linear time-invariant (LTI) systems. By employing off-policy reinforcement learning (RL), the proposed Q-learning algorithm improves exploration of the state space. During the learning process, the algorithm can be implemented using only data sets containing the behavior policy and the corresponding system states, and is therefore data-driven. Moreover, the data sets can be reused, which is computationally efficient. A mild condition on the probing noise is established to ensure convergence of the proposed Q-learning algorithm. Simulation results demonstrate the effectiveness of the developed algorithm.
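The abstract's key ideas (a behavior policy with probing noise, batch data reuse, and a greedy target policy evaluated off-policy) can be illustrated with a generic sketch. Note this is not the authors' algorithm: the paper treats continuous-time systems, while the sketch below uses a standard discrete-time analogue (batch least-squares Q-learning for LQR), and the system matrices `A`, `B` and all gains are hypothetical examples.

```python
import numpy as np

# Hypothetical discrete-time LTI system and quadratic cost (illustration only;
# the paper itself addresses the continuous-time formulation).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(2), np.eye(1)
n, m = 2, 1
p = n + m

def phi(z):
    # Quadratic basis: monomials z_i * z_j for i <= j, so that
    # phi(z) . w == z^T H z for a symmetric matrix H encoded in w.
    return np.array([z[i] * z[j] for i in range(p) for j in range(i, p)])

rng = np.random.default_rng(0)
K = np.array([[1.0, 1.0]])  # initial stabilizing target-policy gain (assumed)

# Collect ONE batch under a noisy behavior policy; probing noise provides
# the excitation, and the batch is reused across all iterations.
X = [rng.standard_normal(n)]
U = []
for t in range(200):
    u = -K @ X[-1] + 0.5 * rng.standard_normal(m)  # behavior = target + noise
    U.append(u)
    X.append(A @ X[-1] + B @ u)

for it in range(10):  # off-policy policy iteration on the fixed batch
    Phi, y = [], []
    for t in range(200):
        x, u, xn = X[t], U[t], X[t + 1]
        un = -K @ xn  # TARGET-policy action at the next state (off-policy)
        Phi.append(phi(np.concatenate([x, u])) - phi(np.concatenate([xn, un])))
        y.append(x @ Qc @ x + u @ Rc @ u)
    w = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)[0]

    # Unpack w into the symmetric Q-function matrix H (off-diagonals halved,
    # since each appears twice in z^T H z).
    H, idx = np.zeros((p, p)), 0
    for i in range(p):
        for j in range(i, p):
            H[i, j] = H[j, i] = w[idx] if i == j else w[idx] / 2
            idx += 1

    # Greedy policy improvement: u = -H_uu^{-1} H_ux x.
    K = np.linalg.solve(H[n:, n:], H[n:, :n])
```

The gain `K` converges to the LQR-optimal feedback without ever using `A` or `B` inside the learning loop, which mirrors the data-driven, model-free property described in the abstract.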

Original language: English
Title of host publication: Proceedings - 2018 IEEE 27th International Symposium on Industrial Electronics, ISIE 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 258-263
Number of pages: 6
ISBN (Print): 9781538637050
DOIs
Publication status: Published - 10 Aug 2018
Event: 27th IEEE International Symposium on Industrial Electronics, ISIE 2018 - Cairns, Australia
Duration: 13 Jun 2018 - 15 Jun 2018

Publication series

Name: IEEE International Symposium on Industrial Electronics
Volume: 2018-June

Conference

Conference: 27th IEEE International Symposium on Industrial Electronics, ISIE 2018
Country/Territory: Australia
City: Cairns
Period: 13/06/18 - 15/06/18
