TY - JOUR
T1 - Toward Multi-Task Generalization in Autonomous Navigation
T2 - A Human-in-the-Loop Adversarial Reinforcement Learning With Diffusion Policy
AU - Hu, Dong
AU - Huang, Chao
AU - Wu, Jingda
AU - Yuan, Xin
N1 - Publisher Copyright:
© 2000-2011 IEEE.
PY - 2025
Y1 - 2025
N2 - Due to the complexity and variability of real-world environments, data-driven autonomous navigation strategies for autonomous ground vehicles have significant potential to improve performance and adaptability in diverse scenarios. Reinforcement learning (RL) has emerged as a promising approach for autonomous navigation. However, existing RL methods often struggle with low sample efficiency, limited adaptability, and poor generalization in dynamic multi-task scenarios. To address these issues, we propose a novel framework: human-in-the-loop adversarial RL with diffusion policy, designed for scalable and robust policy learning. This framework leverages a diffusion model as the policy network, effectively exploring and learning high-dimensional, multi-modal behavior distributions. It also integrates human feedback to improve data efficiency and stabilize policy training. On top of this, adversarial training is employed to improve robustness and adaptability to changes in tasks and distributions. The proposed method is trained in simulation, and the well-trained policy is then transferred to the real world. Experimental results demonstrate that this approach significantly outperforms existing methods in terms of efficiency, stability, generalization, and multi-task adaptability, offering a promising solution for the next generation of autonomous navigation systems.
AB - Due to the complexity and variability of real-world environments, data-driven autonomous navigation strategies for autonomous ground vehicles have significant potential to improve performance and adaptability in diverse scenarios. Reinforcement learning (RL) has emerged as a promising approach for autonomous navigation. However, existing RL methods often struggle with low sample efficiency, limited adaptability, and poor generalization in dynamic multi-task scenarios. To address these issues, we propose a novel framework: human-in-the-loop adversarial RL with diffusion policy, designed for scalable and robust policy learning. This framework leverages a diffusion model as the policy network, effectively exploring and learning high-dimensional, multi-modal behavior distributions. It also integrates human feedback to improve data efficiency and stabilize policy training. On top of this, adversarial training is employed to improve robustness and adaptability to changes in tasks and distributions. The proposed method is trained in simulation, and the well-trained policy is then transferred to the real world. Experimental results demonstrate that this approach significantly outperforms existing methods in terms of efficiency, stability, generalization, and multi-task adaptability, offering a promising solution for the next generation of autonomous navigation systems.
KW - adversarial training
KW - Autonomous navigation
KW - diffusion policy
KW - human-in-the-loop reinforcement learning
KW - multi-task generalization
UR - https://www.scopus.com/pages/publications/105012463921
U2 - 10.1109/TITS.2025.3591239
DO - 10.1109/TITS.2025.3591239
M3 - Article
AN - SCOPUS:105012463921
SN - 1524-9050
VL - 26
SP - 19493
EP - 19507
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
IS - 11
ER -