TY - JOUR
T1 - Adaptive Gait Acquisition through Learning Dynamic Stimulus Instinct of Bipedal Robot
AU - Zhang, Yuanxi
AU - Chen, Xuechao
AU - Meng, Fei
AU - Yu, Zhangguo
AU - Du, Yidong
AU - Zhou, Zishun
AU - Gao, Junyao
N1 - Publisher Copyright: © 2024 by the authors.
PY - 2024/6
Y1 - 2024/6
N2 - Standard alternating leg motions serve as the foundation for simple bipedal gaits, and recent studies have demonstrated the effectiveness of a fixed stimulus signal. However, robots require more dynamic gaits to cope with perturbations and imbalances. In this paper, we introduce dynamic stimulus signals, together with a bipedal locomotion policy, into reinforcement learning (RL). Through the learned stimulus frequency policy, we induce the bipedal robot to acquire both three-dimensional (3D) locomotion and an adaptive gait under disturbance, without relying on an explicit, model-based gait in either the training stage or deployment. In addition, our framework uses a set of specialized reward functions focusing on reliable frequency reflections to ensure correspondence between locomotion features and the dynamic stimulus. Moreover, we demonstrate efficient sim-to-real transfer, enabling a bipedal robot called BITeno to achieve robust locomotion and disturbance resistance in the real world, even in extreme situations of foot sliding. Specifically, under a sudden change in torso velocity of (Formula presented.) m/s in 0.65 s, the recovery time is within 1.5–2.0 s.
AB - Standard alternating leg motions serve as the foundation for simple bipedal gaits, and recent studies have demonstrated the effectiveness of a fixed stimulus signal. However, robots require more dynamic gaits to cope with perturbations and imbalances. In this paper, we introduce dynamic stimulus signals, together with a bipedal locomotion policy, into reinforcement learning (RL). Through the learned stimulus frequency policy, we induce the bipedal robot to acquire both three-dimensional (3D) locomotion and an adaptive gait under disturbance, without relying on an explicit, model-based gait in either the training stage or deployment. In addition, our framework uses a set of specialized reward functions focusing on reliable frequency reflections to ensure correspondence between locomotion features and the dynamic stimulus. Moreover, we demonstrate efficient sim-to-real transfer, enabling a bipedal robot called BITeno to achieve robust locomotion and disturbance resistance in the real world, even in extreme situations of foot sliding. Specifically, under a sudden change in torso velocity of (Formula presented.) m/s in 0.65 s, the recovery time is within 1.5–2.0 s.
KW - adaptive locomotion
KW - bipedal robot
KW - period dynamic gait
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85197924027&partnerID=8YFLogxK
U2 - 10.3390/biomimetics9060310
DO - 10.3390/biomimetics9060310
M3 - Article
AN - SCOPUS:85197924027
SN - 2313-7673
VL - 9
JO - Biomimetics
JF - Biomimetics
IS - 6
M1 - 310
ER -