TY - JOUR
T1 - EMKG
T2 - Embodied Memory Knowledge Graphs for Object-Goal Navigation in Dynamic Open Worlds
AU - Li, Mingyi
AU - Liu, Hui
AU - Li, Ying
AU - Zhang, Shubo
AU - Gao, Chunle
AU - Ma, Xiaokang
AU - Hu, Hanqing
AU - Mao, Weixin
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2026
Y1 - 2026
N2 - Object-Goal Navigation (OGN) in complex domestic environments remains challenging due to uncertainties in spatial memory and semantics. To address this, we introduce EMKG, an embodied multimodal memory knowledge graph framework that enables open-world navigation. In contrast to conventional vision-language navigation (VLN) methods that depend on explicit step-by-step instructions, EMKG uses a multimodal memory mechanism for object navigation based on semantic retrieval, which supports dynamic perception, semantic reasoning, and adaptive control. Specifically, to bridge quadruped robot memory and cognition, EMKG's core component, the Dynamic Memory Knowledge Graph (DMKG), integrates visual inputs, 3D spatial coordinates, and target captions to generalize cross-modal feature correspondences. To keep the knowledge base accurate and actionable over time, we design a visual sampling-based retrieval-augmented generation (RAG) update evaluation strategy that continuously assesses target reachability and incrementally updates the multimodal vector memory. EMKG further employs visual PPO-CLIP adaptive planning, visual pattern recognition, and modality-switching locomotion control to unify multimodal mapping and memory update processes within a single end-to-end pipeline. EMKG establishes stable semantic associations across modalities and is deployed end-to-end on a quadruped robot platform, supporting low-latency autonomous decision-making and obstacle avoidance. Extensive experiments show that EMKG outperforms baseline methods, with average gains of 6.7-12.6% in Success Rate (SR) and 7.3-15.6% in Success weighted by Path Length (SPL) on the Habitat simulator and in physical environments. These results validate EMKG's efficacy in memory-enhanced semantic reasoning and embodied ObjectNav in domestic open-world environments.
KW - Object-goal navigation
KW - Quadruped robot
KW - Retrieval-Augmented Generation
UR - https://www.scopus.com/pages/publications/105028209557
U2 - 10.1109/LRA.2026.3655297
DO - 10.1109/LRA.2026.3655297
M3 - Article
AN - SCOPUS:105028209557
SN - 2377-3766
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
ER -