TY - GEN
T1 - A class of optimal control problem for stochastic discrete-time systems with average reward reinforcement learning
AU - Hu, Yifan
AU - Fu, Junjie
AU - Lv, Yuezu
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/5/10
Y1 - 2021/5/10
AB - This paper addresses a class of optimal control problems for stochastic discrete-time systems using average reward reinforcement learning. First, the optimal control problem for the stochastic discrete-time system is transformed into a sequential decision problem for a Markov decision process (MDP). It is proven that, under the average reward criterion, the admissible policies are gain-optimal and the optimal policy is bias-optimal. Then, sufficient conditions for almost sure (a.s.) stabilization of the system are proposed. Based on these results, an on-policy average-reward-based reinforcement learning algorithm is developed. Finally, simulation results are provided to illustrate the effectiveness of the proposed algorithm.
KW - Average reward
KW - Optimal control
KW - Reinforcement learning
KW - Stochastic discrete-time system
UR - http://www.scopus.com/inward/record.url?scp=85112361849&partnerID=8YFLogxK
U2 - 10.1109/ICPS49255.2021.9468152
DO - 10.1109/ICPS49255.2021.9468152
M3 - Conference contribution
AN - SCOPUS:85112361849
T3 - Proceedings - 2021 4th IEEE International Conference on Industrial Cyber-Physical Systems, ICPS 2021
SP - 829
EP - 834
BT - Proceedings - 2021 4th IEEE International Conference on Industrial Cyber-Physical Systems, ICPS 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Industrial Cyber-Physical Systems, ICPS 2021
Y2 - 10 May 2021 through 13 May 2021
ER -