A multitasking-oriented robot arm motion planning scheme based on deep reinforcement learning and twin synchro-control

Chuzhao Liu; Junyao Gao; Yuanzhen Bi; Xuanyang Shi; Dingkui Tian

doi:10.3390/s20123515

A multitasking-oriented robot arm motion planning scheme based on deep reinforcement learning and twin synchro-control

Chuzhao Liu, Junyao Gao^*, Yuanzhen Bi, Xuanyang Shi, Dingkui Tian

^*此作品的通讯作者

机电学院

科研成果: 期刊稿件 › 文章 › 同行评审

22 引用（Scopus）

摘要

Humanoid robots are equipped with humanoid arms to make them more acceptable to the general public. Humanoid robots are a great challenge in robotics. The concept of digital twin technology complies with the guiding ideology of not only Industry 4.0, but also Made in China 2025. This paper proposes a scheme that combines deep reinforcement learning (DRL) with digital twin technology for controlling humanoid robot arms. For rapid and stable motion planning for humanoid robots, multitasking-oriented training using the twin synchro-control (TSC) scheme with DRL is proposed. For switching between tasks, the robot arm training must be quick and diverse. In this work, an approach for obtaining a priori knowledge as input to DRL is developed and verified using simulations. Two simple examples are developed in a simulation environment. We developed a data acquisition system to generate angle data efficiently and automatically. These data are used to improve the reward function of the deep deterministic policy gradient (DDPG) and quickly train the robot for a task. The approach is applied to a model of the humanoid robot BHR-6, a humanoid robot with multiple-motion mode and a sophisticated mechanical structure. Using the policies trained in the simulations, the humanoid robot can perform tasks that are not possible to train with existing methods. The training is fast and allows the robot to perform multiple tasks. Our approach utilizes human joint angle data collected by the data acquisition system to solve the problem of a sparse reward in DRL for two simple tasks. A comparison with simulation results for controllers trained using the vanilla DDPG show that the designed controller developed using the DDPG with the TSC scheme have great advantages in terms of learning stability and convergence speed.

源语言	英语
文章编号	3515
页（从-至）	1-35
页数	35
期刊	Sensors
卷	20
期	12
DOI	https://doi.org/10.3390/s20123515
出版状态	已出版 - 6月 2020

访问文件

10.3390/s20123515

其它文件与链接

链接到 Scopus 的出版物

引用此

@article{9d155c72448641ba9d285df52612e7da,

title = "A multitasking-oriented robot arm motion planning scheme based on deep reinforcement learning and twin synchro-control",

abstract = "Humanoid robots are equipped with humanoid arms to make them more acceptable to the general public. Humanoid robots are a great challenge in robotics. The concept of digital twin technology complies with the guiding ideology of not only Industry 4.0, but also Made in China 2025. This paper proposes a scheme that combines deep reinforcement learning (DRL) with digital twin technology for controlling humanoid robot arms. For rapid and stable motion planning for humanoid robots, multitasking-oriented training using the twin synchro-control (TSC) scheme with DRL is proposed. For switching between tasks, the robot arm training must be quick and diverse. In this work, an approach for obtaining a priori knowledge as input to DRL is developed and verified using simulations. Two simple examples are developed in a simulation environment. We developed a data acquisition system to generate angle data efficiently and automatically. These data are used to improve the reward function of the deep deterministic policy gradient (DDPG) and quickly train the robot for a task. The approach is applied to a model of the humanoid robot BHR-6, a humanoid robot with multiple-motion mode and a sophisticated mechanical structure. Using the policies trained in the simulations, the humanoid robot can perform tasks that are not possible to train with existing methods. The training is fast and allows the robot to perform multiple tasks. Our approach utilizes human joint angle data collected by the data acquisition system to solve the problem of a sparse reward in DRL for two simple tasks. A comparison with simulation results for controllers trained using the vanilla DDPG show that the designed controller developed using the DDPG with the TSC scheme have great advantages in terms of learning stability and convergence speed.",

keywords = "Deep reinforcement learning, Humanoid robot, Twin synchro-control",

author = "Chuzhao Liu and Junyao Gao and Yuanzhen Bi and Xuanyang Shi and Dingkui Tian",

note = "Publisher Copyright: {\textcopyright} 2020 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2020",

month = jun,

doi = "10.3390/s20123515",

language = "English",

volume = "20",

pages = "1--35",

journal = "Sensors",

issn = "1424-8220",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "12",

}

TY - JOUR

T1 - A multitasking-oriented robot arm motion planning scheme based on deep reinforcement learning and twin synchro-control

AU - Liu, Chuzhao

AU - Gao, Junyao

AU - Bi, Yuanzhen

AU - Shi, Xuanyang

AU - Tian, Dingkui

PY - 2020/6

Y1 - 2020/6

N2 - Humanoid robots are equipped with humanoid arms to make them more acceptable to the general public. Humanoid robots are a great challenge in robotics. The concept of digital twin technology complies with the guiding ideology of not only Industry 4.0, but also Made in China 2025. This paper proposes a scheme that combines deep reinforcement learning (DRL) with digital twin technology for controlling humanoid robot arms. For rapid and stable motion planning for humanoid robots, multitasking-oriented training using the twin synchro-control (TSC) scheme with DRL is proposed. For switching between tasks, the robot arm training must be quick and diverse. In this work, an approach for obtaining a priori knowledge as input to DRL is developed and verified using simulations. Two simple examples are developed in a simulation environment. We developed a data acquisition system to generate angle data efficiently and automatically. These data are used to improve the reward function of the deep deterministic policy gradient (DDPG) and quickly train the robot for a task. The approach is applied to a model of the humanoid robot BHR-6, a humanoid robot with multiple-motion mode and a sophisticated mechanical structure. Using the policies trained in the simulations, the humanoid robot can perform tasks that are not possible to train with existing methods. The training is fast and allows the robot to perform multiple tasks. Our approach utilizes human joint angle data collected by the data acquisition system to solve the problem of a sparse reward in DRL for two simple tasks. A comparison with simulation results for controllers trained using the vanilla DDPG show that the designed controller developed using the DDPG with the TSC scheme have great advantages in terms of learning stability and convergence speed.

AB - Humanoid robots are equipped with humanoid arms to make them more acceptable to the general public. Humanoid robots are a great challenge in robotics. The concept of digital twin technology complies with the guiding ideology of not only Industry 4.0, but also Made in China 2025. This paper proposes a scheme that combines deep reinforcement learning (DRL) with digital twin technology for controlling humanoid robot arms. For rapid and stable motion planning for humanoid robots, multitasking-oriented training using the twin synchro-control (TSC) scheme with DRL is proposed. For switching between tasks, the robot arm training must be quick and diverse. In this work, an approach for obtaining a priori knowledge as input to DRL is developed and verified using simulations. Two simple examples are developed in a simulation environment. We developed a data acquisition system to generate angle data efficiently and automatically. These data are used to improve the reward function of the deep deterministic policy gradient (DDPG) and quickly train the robot for a task. The approach is applied to a model of the humanoid robot BHR-6, a humanoid robot with multiple-motion mode and a sophisticated mechanical structure. Using the policies trained in the simulations, the humanoid robot can perform tasks that are not possible to train with existing methods. The training is fast and allows the robot to perform multiple tasks. Our approach utilizes human joint angle data collected by the data acquisition system to solve the problem of a sparse reward in DRL for two simple tasks. A comparison with simulation results for controllers trained using the vanilla DDPG show that the designed controller developed using the DDPG with the TSC scheme have great advantages in terms of learning stability and convergence speed.

KW - Deep reinforcement learning

KW - Humanoid robot

KW - Twin synchro-control

UR - http://www.scopus.com/inward/record.url?scp=85086760300&partnerID=8YFLogxK

U2 - 10.3390/s20123515

DO - 10.3390/s20123515

M3 - Article

C2 - 32575907

AN - SCOPUS:85086760300

SN - 1424-8220

VL - 20

SP - 1

EP - 35

JO - Sensors

JF - Sensors

IS - 12

M1 - 3515

ER -

A multitasking-oriented robot arm motion planning scheme based on deep reinforcement learning and twin synchro-control

摘要

访问文件

其它文件与链接

指纹

引用此