STEP Planner: Constructing cross-hierarchical subgoal tree as an embodied long-horizon task planner

  • Tianxing Zhou
  • Zhirui Wang
  • Haojia Ao
  • Guangyan Chen
  • Boyang Xing
  • Jingwen Cheng
  • Yi Yang
  • Yufeng Yue*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The ability to perform reliable long-horizon task planning is crucial for deploying robots in real-world environments. However, directly employing Large Language Models (LLMs) as action sequence generators often results in low success rates due to their limited reasoning ability for long-horizon embodied tasks. In the STEP framework, we construct a subgoal tree through a pair of closed-loop models: a subgoal decomposition model and a leaf node termination model. Within this framework, we develop a hierarchical tree structure that spans from coarse to fine resolutions. The subgoal decomposition model leverages a foundation LLM to break down complex goals into manageable subgoals, thereby spanning the subgoal tree. The leaf node termination model provides real-time feedback based on environmental states, determining when to terminate the tree spanning and ensuring each leaf node can be directly converted into a primitive action. Experiments conducted both on the VirtualHome WAH-NL benchmark and on real robots demonstrate that STEP achieves long-horizon embodied task completion with success rates up to 34% (WAH-NL) and 25% (real robot), outperforming SOTA methods.
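
The tree construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `SubgoalNode`, `build_subgoal_tree`, `toy_decompose`, and `toy_is_leaf` are hypothetical, the rule table stands in for the LLM-based decomposition model, and the termination check here ignores environmental state, which the paper's leaf node termination model does take into account.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SubgoalNode:
    """One node of the subgoal tree (hypothetical structure)."""
    goal: str
    children: List["SubgoalNode"] = field(default_factory=list)


def build_subgoal_tree(
    goal: str,
    decompose: Callable[[str], List[str]],
    is_leaf: Callable[[str], bool],
    max_depth: int = 5,
    depth: int = 0,
) -> SubgoalNode:
    """Expand a goal coarse-to-fine until the termination check accepts it."""
    node = SubgoalNode(goal)
    # Stop when the termination check says the goal maps to a primitive
    # action, or when the (assumed) depth limit is reached.
    if depth >= max_depth or is_leaf(goal):
        return node
    for sub in decompose(goal):
        node.children.append(
            build_subgoal_tree(sub, decompose, is_leaf, max_depth, depth + 1)
        )
    return node


def leaves(node: SubgoalNode) -> List[str]:
    """Collect leaf goals left to right; these form the action sequence."""
    if not node.children:
        return [node.goal]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


# Toy stand-ins for the two models (assumptions, for illustration only).
TOY_RULES = {
    "serve apple": ["get apple", "put apple on table"],
    "get apple": ["walk to fridge", "open fridge", "grab apple"],
    "put apple on table": ["walk to table", "place apple"],
}
PRIMITIVES = {
    "walk to fridge", "open fridge", "grab apple",
    "walk to table", "place apple",
}


def toy_decompose(goal: str) -> List[str]:
    return TOY_RULES.get(goal, [])


def toy_is_leaf(goal: str) -> bool:
    return goal in PRIMITIVES


tree = build_subgoal_tree("serve apple", toy_decompose, toy_is_leaf)
plan = leaves(tree)
```

Reading the leaves left to right gives the plan as a sequence of primitive actions. In a closed-loop setting, the termination check would be re-evaluated against the current environment state rather than a fixed set.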

Original language: English
Title of host publication: IROS 2025 - 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, Conference Proceedings
Editors: Christian Laugier, Alessandro Renzaglia, Nikolay Atanasov, Stan Birchfield, Grzegorz Cielniak, Leonardo De Mattos, Laura Fiorini, Philippe Giguere, Kenji Hashimoto, Javier Ibanez-Guzman, Tetsushi Kamegawa, Jinoh Lee, Giuseppe Loianno, Kevin Luck, Hisataka Maruyama, Philippe Martinet, Hadi Moradi, Urbano Nunes, Julien Pettre, Alberto Pretto, Tommaso Ranzani, Arne Ronnau, Silvia Rossi, Elliott Rouse, Fabio Ruggiero, Olivier Simonin, Danwei Wang, Ming Yang, Eiichi Yoshida, Huijing Zhao
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 16731-16738
Number of pages: 8
ISBN (Electronic): 9798331543938
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025 - Hangzhou, China
Duration: 19 Oct 2025 to 25 Oct 2025

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025
Country/Territory: China
City: Hangzhou
Period: 19/10/25 to 25/10/25
