TY - GEN
T1 - An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems
AU - Zhang, Xiaofei
AU - Ma, Hongbin
AU - Chen, Junyong
AU - Li, Weixue
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - Adaptive dynamic programming (ADP) is an intelligent, non-model-based control method that can directly approximate the optimal control policy via online learning. A gradient algorithm is usually used to update the weights of the action and critic networks; however, gradient descent-based learning methods are generally very slow due to improper learning steps, and they may easily converge to a local minimum. To overcome these disadvantages, this paper introduces a novel ADP algorithm based on the initial-training-free online extreme learning machine (ITF-OELM), in which the critic network weights linking hidden nodes to output nodes are obtained by least squares instead of a gradient algorithm. The ADP algorithm based on ITF-OELM is tested on a discrete-time torsional pendulum system, and simulation results indicate that it makes the system converge in a shorter time than ADP based on the gradient algorithm.
KW - Adaptive Dynamic Programming
KW - Discrete-time Systems
KW - Extreme Learning Machine
UR - http://www.scopus.com/inward/record.url?scp=85125167753&partnerID=8YFLogxK
U2 - 10.1109/CCDC52312.2021.9601954
DO - 10.1109/CCDC52312.2021.9601954
M3 - Conference contribution
AN - SCOPUS:85125167753
T3 - Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021
SP - 3006
EP - 3011
BT - Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 33rd Chinese Control and Decision Conference, CCDC 2021
Y2 - 22 May 2021 through 24 May 2021
ER -