Approximate Optimal Stabilization Control of Servo Mechanisms based on Reinforcement Learning Scheme

Yongfeng Lv; Xuemei Ren; Shuangyi Hu; Hao Xu

doi:10.1007/s12555-018-0551-6

Yongfeng Lv, Xuemei Ren^*, Shuangyi Hu, Hao Xu

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

17 citations (Scopus)

Abstract

A reinforcement learning (RL) based adaptive dynamic programming (ADP) scheme is developed to learn the approximate optimal stabilization input of servo mechanisms, where the unknown system dynamics are approximated with a three-layer neural network (NN) identifier. First, the servo mechanism model is constructed and a three-layer NN identifier is used to approximate the unknown servo system. The NN weights of both the hidden layer and the output layer are tuned synchronously with an adaptive gradient law. An RL-based three-layer critic NN is then used to learn the optimal cost function: the NN weights of the first layer are set as constants, and the NN weights of the second layer are updated by minimizing the squared Hamilton-Jacobi-Bellman (HJB) error. The optimal stabilization input of the servo mechanism is obtained from the three-layer NN identifier and the RL-based critic NN scheme, and it can drive the motor speed from its initial value to the given value. Moreover, the convergence of the identifier and the RL-based critic NN is proved, and the stability of the cost function under the proposed optimal input is analyzed. Finally, a servo mechanism model and a complex system are used to verify the correctness of the proposed methods.
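
The critic update mentioned in the abstract can be illustrated with a generic ADP formulation. The sketch below is not taken from the paper: the input-affine model, the quadratic cost with weights $Q$ and $R$, the basis vector $\phi$ (standing in for the fixed first layer and its activation), and the learning rate $\alpha$ are all assumptions chosen for illustration.

```latex
% Assumed plant and cost (illustrative, not the paper's notation)
\dot{x} = f(x) + g(x)\,u, \qquad
V(x(t)) = \int_{t}^{\infty} \left( x^{\top} Q x + u^{\top} R u \right) d\tau

% Minimizing the Hamiltonian over u gives the optimal input
H(x, u, \nabla V) = x^{\top} Q x + u^{\top} R u + \nabla V^{\top} \left( f(x) + g(x)\,u \right),
\qquad
u^{*} = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}(x)

% Critic with fixed first layer: only the output weights \hat{W} are tuned
\hat{V}(x) = \hat{W}^{\top} \phi(x), \qquad
\hat{u} = -\tfrac{1}{2} R^{-1} \hat{g}(x)^{\top} \nabla\phi(x)^{\top} \hat{W}

% HJB residual built from the identifier estimates \hat{f}, \hat{g}
e = x^{\top} Q x + \hat{u}^{\top} R \hat{u}
    + \hat{W}^{\top} \nabla\phi(x) \left( \hat{f}(x) + \hat{g}(x)\,\hat{u} \right)

% Gradient descent on the squared residual
E = \tfrac{1}{2} e^{2}, \qquad
\dot{\hat{W}} = -\alpha \, \frac{\partial E}{\partial \hat{W}}
```

Here $\nabla\phi(x)$ denotes the Jacobian of $\phi$ with respect to $x$, and $\hat{f}$, $\hat{g}$ are the estimates supplied by the NN identifier. The exact update law, normalization, and activation functions used by the authors may differ.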

Original language	English
Pages (from-to)	2655-2665
Number of pages	11
Journal	International Journal of Control, Automation and Systems
Volume	17
Issue number	10
DOI	https://doi.org/10.1007/s12555-018-0551-6
Publication status	Published - 1 Oct 2019

Cite this

@article{9f810867784a4cd48afb3a533d4b59fb,

title = "Approximate Optimal Stabilization Control of Servo Mechanisms based on Reinforcement Learning Scheme",

abstract = "A reinforcement learning (RL) based adaptive dynamic programming (ADP) is developed to learn the approximate optimal stabilization input of the servo mechanisms, where the unknown system dynamics are approximated with a three-layer neural network (NN) identifier. First, the servo mechanism model is constructed and a three-layer NN identifier is used to approximate the unknown servo system. The NN weights of both the hidden layer and output layer are synchronously tuned with an adaptive gradient law. An RL-based critic three-layer NN is then used to learn the optimal cost function, where NN weights of the first layer are set as constants, NN weights of the second layer are updated by minimizing the squared Hamilton-Jacobi-Bellman (HJB) error. The optimal stabilization input of the servomechanism is obtained based on the three-layer NN identifier and RL-based critic NN scheme, which can stabilize the motor speed from its initial value to the given value. Moreover, the convergence analysis of the identifier and RL-based critic NN is proved, the stability of the cost function with the proposed optimal input is analyzed. Finally, a servo mechanism model and a complex system are provided to verify the correctness of the proposed methods.",

keywords = "Adaptive dynamic programming, neural networks, optimal control, reinforcement learning, servomechanisms",

author = "Yongfeng Lv and Xuemei Ren and Shuangyi Hu and Hao Xu",

note = "Publisher Copyright: {\textcopyright} 2019, ICROS, KIEE and Springer.",

year = "2019",

month = oct,

day = "1",

doi = "10.1007/s12555-018-0551-6",

language = "English",

volume = "17",

pages = "2655--2665",

journal = "International Journal of Control, Automation and Systems",

issn = "1598-6446",

publisher = "Institute of Control, Robotics and Systems",

number = "10",

}

TY - JOUR

T1 - Approximate Optimal Stabilization Control of Servo Mechanisms based on Reinforcement Learning Scheme

AU - Lv, Yongfeng

AU - Ren, Xuemei

AU - Hu, Shuangyi

AU - Xu, Hao

PY - 2019/10/1

Y1 - 2019/10/1

N2 - A reinforcement learning (RL) based adaptive dynamic programming (ADP) is developed to learn the approximate optimal stabilization input of the servo mechanisms, where the unknown system dynamics are approximated with a three-layer neural network (NN) identifier. First, the servo mechanism model is constructed and a three-layer NN identifier is used to approximate the unknown servo system. The NN weights of both the hidden layer and output layer are synchronously tuned with an adaptive gradient law. An RL-based critic three-layer NN is then used to learn the optimal cost function, where NN weights of the first layer are set as constants, NN weights of the second layer are updated by minimizing the squared Hamilton-Jacobi-Bellman (HJB) error. The optimal stabilization input of the servomechanism is obtained based on the three-layer NN identifier and RL-based critic NN scheme, which can stabilize the motor speed from its initial value to the given value. Moreover, the convergence analysis of the identifier and RL-based critic NN is proved, the stability of the cost function with the proposed optimal input is analyzed. Finally, a servo mechanism model and a complex system are provided to verify the correctness of the proposed methods.

KW - Adaptive dynamic programming

KW - neural networks

KW - optimal control

KW - reinforcement learning

KW - servomechanisms

UR - http://www.scopus.com/inward/record.url?scp=85068879434&partnerID=8YFLogxK

U2 - 10.1007/s12555-018-0551-6

DO - 10.1007/s12555-018-0551-6

M3 - Article

AN - SCOPUS:85068879434

SN - 1598-6446

VL - 17

SP - 2655

EP - 2665

JO - International Journal of Control, Automation and Systems

JF - International Journal of Control, Automation and Systems

IS - 10

ER -
