An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems

Xiaofei Zhang; Hongbin Ma; Junyong Chen; Weixue Li

doi:10.1109/CCDC52312.2021.9601954

Xiaofei Zhang, Hongbin Ma^*, Junyong Chen, Weixue Li

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Adaptive dynamic programming (ADP) is an intelligent control method that requires no system model and can directly approximate the optimal control policy via online learning. The weights of the action and critic networks are usually updated by a gradient algorithm; however, gradient descent-based learning can be very slow when the learning step is poorly chosen, and it may easily converge to a local minimum. To overcome these disadvantages, this paper introduces a novel ADP algorithm based on the initial-training-free online extreme learning machine (ITF-OELM), in which the critic network weights linking hidden nodes to output nodes are obtained by least squares instead of a gradient algorithm. The ADP algorithm based on ITF-OELM is tested on a discrete-time torsional pendulum system, and simulation results indicate that it makes the system converge in a shorter time than ADP based on the gradient algorithm.
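The abstract does not spell out the ITF-OELM update itself, so the following is only a minimal sketch of the general idea it names: in an extreme learning machine the hidden layer is fixed at random values and only the hidden-to-output weights are solved for by least squares. The function names, the sigmoid activation, and the recursive least-squares step that starts from zero weights and a large covariance matrix are assumptions made for illustration, not the authors' algorithm.

```python
import numpy as np

def hidden_features(X, W, b):
    """Sigmoid hidden-layer outputs; W and b are random and never trained."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def fit_output_weights(H, T):
    """Batch least squares: the beta minimizing ||H @ beta - T||^2."""
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return beta

def rls_update(beta, P, h, t):
    """One standard recursive least-squares step for a single sample.

    beta: (L, m) output weights, P: (L, L) covariance-like matrix,
    h: (L,) hidden-layer output, t: scalar or (m,) target.
    """
    h = h.reshape(-1, 1)
    Ph = P @ h
    gain = Ph / (1.0 + h.T @ Ph)
    err = np.atleast_1d(t) - (h.T @ beta).ravel()
    beta = beta + gain @ err.reshape(1, -1)
    P = P - gain @ Ph.T
    return beta, P
```

In an online setting one might start from `beta = np.zeros((L, m))` and `P = c * np.eye(L)` with a large constant `c` (a hypothetical choice here), then call `rls_update` once per new sample. No learning rate has to be tuned, which is the contrast with gradient-based updates that the abstract draws.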

Original language	English
Title of host publication	Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	3006-3011
Number of pages	6
ISBN (Electronic)	9781665440899
DOIs	https://doi.org/10.1109/CCDC52312.2021.9601954
Publication status	Published - 2021
Event	33rd Chinese Control and Decision Conference, CCDC 2021, Kunming, China, 22 May 2021 → 24 May 2021

Publication series

Name	Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021

Conference

Conference	33rd Chinese Control and Decision Conference, CCDC 2021
Country/Territory	China
City	Kunming
Period	22/05/21 → 24/05/21

Keywords

Adaptive Dynamic Programming
Discrete-time Systems
Extreme Learning Machine

Access to Document

10.1109/CCDC52312.2021.9601954

Cite this

Zhang, X., Ma, H., Chen, J., & Li, W. (2021). An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems. In Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021 (pp. 3006-3011). (Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CCDC52312.2021.9601954

Zhang, Xiaofei ; Ma, Hongbin ; Chen, Junyong et al. / An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems. Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 3006-3011 (Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021).

@inproceedings{e15391ec738f4d3abaecf29fe40e056a,

title = "An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems",

abstract = "Adaptive dynamic programming (ADP) is a kind of intelligent control method, and it is a non-model-based method that can directly approximate the optimal control policy via online learning. The gradient algorithm is usually used to update weights of action networks and critic networks, however it is clear that gradient descent-based learning methods are generally very slow due to improper learning steps or may easily converge to local minimum. In this paper, in order to overcome those disadvantages of gradient descent-based learning methods, a novel ADP algorithm based on initial-training-free online extreme learning machine (ITF-OELM), in which the critic network link weights of hidden nodes to output nodes can be obtained by least squares instead of gradient algorithm, is introduced. Finally, the ADP algorithm based on ITF-OELM is tested on a discrete time torsional pendulum system, and simulation results indicate that this algorithm makes the system converge in a shorter time compared with the ADP based on gradient algorithm.",

keywords = "Adaptive Dynamic Programming, Discrete-time Systems, Extreme Learning Machine",

author = "Xiaofei Zhang and Hongbin Ma and Junyong Chen and Weixue Li",

note = "Publisher Copyright: {\textcopyright} 2021 IEEE.; 33rd Chinese Control and Decision Conference, CCDC 2021 ; Conference date: 22-05-2021 Through 24-05-2021",

year = "2021",

doi = "10.1109/CCDC52312.2021.9601954",

language = "English",

series = "Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "3006--3011",

booktitle = "Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021",

address = "United States",

}

Zhang, X, Ma, H, Chen, J & Li, W 2021, An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems. in Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021. Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021, Institute of Electrical and Electronics Engineers Inc., pp. 3006-3011, 33rd Chinese Control and Decision Conference, CCDC 2021, Kunming, China, 22/05/21. https://doi.org/10.1109/CCDC52312.2021.9601954

An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems. / Zhang, Xiaofei; Ma, Hongbin; Chen, Junyong et al.
Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021. Institute of Electrical and Electronics Engineers Inc., 2021. p. 3006-3011 (Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021).

TY - GEN

T1 - An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems

AU - Zhang, Xiaofei

AU - Ma, Hongbin

AU - Chen, Junyong

AU - Li, Weixue

PY - 2021

Y1 - 2021

AB - Adaptive dynamic programming (ADP) is a kind of intelligent control method, and it is a non-model-based method that can directly approximate the optimal control policy via online learning. The gradient algorithm is usually used to update weights of action networks and critic networks, however it is clear that gradient descent-based learning methods are generally very slow due to improper learning steps or may easily converge to local minimum. In this paper, in order to overcome those disadvantages of gradient descent-based learning methods, a novel ADP algorithm based on initial-training-free online extreme learning machine (ITF-OELM), in which the critic network link weights of hidden nodes to output nodes can be obtained by least squares instead of gradient algorithm, is introduced. Finally, the ADP algorithm based on ITF-OELM is tested on a discrete time torsional pendulum system, and simulation results indicate that this algorithm makes the system converge in a shorter time compared with the ADP based on gradient algorithm.

KW - Adaptive Dynamic Programming

KW - Discrete-time Systems

KW - Extreme Learning Machine

UR - http://www.scopus.com/inward/record.url?scp=85125167753&partnerID=8YFLogxK

U2 - 10.1109/CCDC52312.2021.9601954

DO - 10.1109/CCDC52312.2021.9601954

M3 - Conference contribution

AN - SCOPUS:85125167753

T3 - Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021

SP - 3006

EP - 3011

BT - Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 33rd Chinese Control and Decision Conference, CCDC 2021

Y2 - 22 May 2021 through 24 May 2021

ER -

Zhang X, Ma H, Chen J, Li W. An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems. In Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021. Institute of Electrical and Electronics Engineers Inc. 2021. p. 3006-3011. (Proceedings of the 33rd Chinese Control and Decision Conference, CCDC 2021). doi: 10.1109/CCDC52312.2021.9601954
