TY - GEN
T1 - Emergency Evacuation Map Guided Navigation via Topological Alignment and VLM Reasoning
AU - Chen, Canzhi
AU - Huang, Weiqi
AU - Li, Jiaxin
AU - Wang, Zan
AU - Di, Huijun
AU - Liang, Wei
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2026.
PY - 2026
Y1 - 2026
N2 - Traditional map-based navigation requires time-consuming mapping, while map-free methods often involve inefficient exploration. Both are ill-suited to time-critical scenarios such as emergency rescue. Readily available structural semantic maps (e.g., evacuation maps) are inherently well-suited to such scenarios, as they provide crucial geometric and semantic cues to support efficient navigation. However, applying such maps to robot navigation remains challenging because of the discrepancy between the map and the real environment, caused by the lack of metric information and inherent geometric distortions. To address these challenges, we propose ENAV, a unified framework that integrates room topology extraction, topology-based localization through alignment, and vision–language model (VLM)-guided planning to enable efficient navigation using evacuation maps. Specifically, given a target room, ENAV first extracts room topology from both the evacuation map and the metric map constructed in real time, and performs localization via topology alignment. It then employs the VLM to generate intermediate sub-goals, and finally plans low-level actions to reach each sub-goal incrementally. Extensive experiments on our curated dataset demonstrate that ENAV outperforms the baselines by a large margin in terms of success rate (SR) and success weighted by path length (SPL), highlighting the effectiveness and efficiency of the proposed framework.
UR - https://www.scopus.com/pages/publications/105028456919
U2 - 10.1007/978-981-95-5679-3_36
DO - 10.1007/978-981-95-5679-3_36
M3 - Conference contribution
AN - SCOPUS:105028456919
SN - 9789819556786
T3 - Lecture Notes in Computer Science
SP - 519
EP - 533
BT - Pattern Recognition and Computer Vision - 8th Chinese Conference, PRCV 2025, Proceedings
A2 - Kittler, Josef
A2 - Xiong, Hongkai
A2 - Yang, Jian
A2 - Chen, Xilin
A2 - Lu, Jiwen
A2 - Lin, Weiyao
A2 - Yu, Jingyi
A2 - Zheng, Weishi
PB - Springer Science and Business Media Deutschland GmbH
T2 - 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025
Y2 - 15 October 2025 through 18 October 2025
ER -