TY - JOUR
T1 - Online Behavior-Centric Adaptation for Bipedal Robot Sim-to-Real Transfer With Unmodeled Dynamics Mismatch
AU - Chen, Xuechao
AU - Du, Yidong
AU - Zhou, Zishun
AU - Yuan, Zhicheng
AU - Zhao, Qingrui
AU - Meng, Fei
AU - Yu, Zhangguo
AU - Lu, Peng
AU - Huang, Qiang
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2026
Y1 - 2026
N2 - Bipedal robots have achieved remarkable locomotion capabilities through reinforcement learning (RL), yet their real-world deployment remains hindered by the sim-to-real gap: dynamics mismatches between simulation and reality that degrade locomotion performance through behavioral deviations. This work introduces an online behavior adaptation framework that bridges this gap at the behavioral level by dynamically aligning emergent locomotion strategies with simulation-derived objectives. Our method integrates two core innovations: (1) a structured latent space constructed via an augmented Variational Autoencoder (VAE), which quantifies behavioral divergence through domain-invariant representations of locomotion patterns, and (2) a closed-loop adaptation module that maps latent-space deviations to real-time adjustments in low-level controller parameters. By reformulating sim-to-real transfer as a problem of behavioral alignment rather than explicit dynamics matching, the framework enables continuous adaptation to unmodeled dynamics mismatch without requiring system identification or offline retraining. Extensive experimental evaluations demonstrate the effectiveness of the proposed method, highlighting its potential to bridge the behavior gap between simulation and reality. Note to Practitioners: This paper was motivated by the problem that legged robot locomotion with a learning-based controller can suffer a performance drop due to the sim-to-real gap, which is directly reflected in the behavior deviation between the simulated and real-world robot. Traditional methods require heavy manual parameter tuning. This paper proposes a novel online behavior adaptation framework to alleviate the sim-to-real gap, using a trained robot behavior encoding network and a behavior adaptation network. This framework enables the robot to detect behavioral deviations and adjust low-level control parameters automatically. The simulated and real-world experiments suggest that the proposed framework can reduce the behavior deviation between real-world and simulated robot locomotion under unmodeled dynamics mismatch by self-correcting locomotion strategies in real time when faced with unexpected disturbances, and can reduce manual tuning effort. It has not yet been validated on complex terrains, which is left for future research.
AB - Bipedal robots have achieved remarkable locomotion capabilities through reinforcement learning (RL), yet their real-world deployment remains hindered by the sim-to-real gap: dynamics mismatches between simulation and reality that degrade locomotion performance through behavioral deviations. This work introduces an online behavior adaptation framework that bridges this gap at the behavioral level by dynamically aligning emergent locomotion strategies with simulation-derived objectives. Our method integrates two core innovations: (1) a structured latent space constructed via an augmented Variational Autoencoder (VAE), which quantifies behavioral divergence through domain-invariant representations of locomotion patterns, and (2) a closed-loop adaptation module that maps latent-space deviations to real-time adjustments in low-level controller parameters. By reformulating sim-to-real transfer as a problem of behavioral alignment rather than explicit dynamics matching, the framework enables continuous adaptation to unmodeled dynamics mismatch without requiring system identification or offline retraining. Extensive experimental evaluations demonstrate the effectiveness of the proposed method, highlighting its potential to bridge the behavior gap between simulation and reality. Note to Practitioners: This paper was motivated by the problem that legged robot locomotion with a learning-based controller can suffer a performance drop due to the sim-to-real gap, which is directly reflected in the behavior deviation between the simulated and real-world robot. Traditional methods require heavy manual parameter tuning. This paper proposes a novel online behavior adaptation framework to alleviate the sim-to-real gap, using a trained robot behavior encoding network and a behavior adaptation network. This framework enables the robot to detect behavioral deviations and adjust low-level control parameters automatically. The simulated and real-world experiments suggest that the proposed framework can reduce the behavior deviation between real-world and simulated robot locomotion under unmodeled dynamics mismatch by self-correcting locomotion strategies in real time when faced with unexpected disturbances, and can reduce manual tuning effort. It has not yet been validated on complex terrains, which is left for future research.
KW - Reinforcement learning
KW - bipedal robot
KW - sim-to-real
UR - https://www.scopus.com/pages/publications/105026234817
U2 - 10.1109/TASE.2025.3648835
DO - 10.1109/TASE.2025.3648835
M3 - Article
AN - SCOPUS:105026234817
SN - 1545-5955
VL - 23
SP - 1533
EP - 1545
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
ER -