Structured Scene Memory for Vision-Language Navigation

Hanqing Wang, Wenguan Wang*, Wei Liang*, Caiming Xiong, Jianbing Shen

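*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

75 Citations (Scopus)

Abstract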

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., requiring an agent to navigate 3D environments by following linguistic instructions. However, current VLN agents simply store their past experiences/observations as latent states in recurrent networks, failing to capture environment layouts or perform long-term planning. To address these limitations, we propose a crucial architecture, called Structured Scene Memory (SSM). It is compartmentalized enough to accurately memorize the percepts during navigation. It also serves as a structured scene representation, which captures and disentangles visual and geometric cues in the environment. SSM has a collect-read controller that adaptively collects information for supporting current decision making and mimics iterative algorithms for long-range reasoning. As SSM provides a complete action space, i.e., all the navigable places on the map, a frontier-exploration based navigation decision making strategy is introduced to enable efficient and global planning. Experimental results on two VLN datasets (i.e., R2R and R4R) show that our method achieves state-of-the-art performance on several metrics.
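
The idea of a map-based memory whose action space is every navigable place seen so far can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `SceneMemory`, the helper `choose_frontier`, the place names, and the hand-written `score` function are all hypothetical, and the learned collect-read controller described in the abstract is replaced here by a plain lookup of made-up scores. The sketch only shows how frontiers (places observed as navigable but not yet visited) form a global action space, and how a route to a chosen frontier can be planned over the stored graph.

```python
from collections import deque


class SceneMemory:
    """Toy topological map: visited places plus their navigable neighbors."""

    def __init__(self):
        self.edges = {}       # place -> set of adjacent places
        self.visited = set()  # places the agent has actually stood at

    def add_observation(self, place, navigable_neighbors):
        """Record a visit to `place` and the places reachable from it."""
        self.visited.add(place)
        self.edges.setdefault(place, set())
        for nb in navigable_neighbors:
            self.edges[place].add(nb)
            self.edges.setdefault(nb, set()).add(place)

    def frontiers(self):
        """Places seen as navigable but not yet visited (the global action space)."""
        return sorted(set(self.edges) - self.visited)

    def path_to(self, start, goal):
        """Breadth-first shortest path over the map built so far."""
        parents = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for nb in sorted(self.edges[node]):
                if nb not in parents:
                    parents[nb] = node
                    queue.append(nb)
        return None


def choose_frontier(memory, score):
    """Pick the highest-scoring frontier; `score` is a hypothetical stand-in
    for a learned, instruction-conditioned scoring model."""
    return max(memory.frontiers(), key=score)


# Hypothetical walk: the agent visits "hall", then "kitchen".
memory = SceneMemory()
memory.add_observation("hall", ["kitchen", "stairs"])
memory.add_observation("kitchen", ["pantry"])

# Made-up scores standing in for instruction grounding.
made_up_scores = {"stairs": 0.2, "pantry": 0.9}
target = choose_frontier(memory, score=lambda p: made_up_scores.get(p, 0.0))
route = memory.path_to("kitchen", target)
```

Because every frontier stays selectable, the agent in this toy setting could also pick "stairs" while standing in "kitchen" and backtrack through "hall", which is the kind of global decision a purely local action space does not allow.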

Original language: English
Title of host publication: Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Publisher: IEEE Computer Society
Pages: 8451-8460
Number of pages: 10
ISBN (Electronic): 9781665445092
DOI
Publication status: Published - 2021
Event: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 - Virtual, Online, United States
Duration: 19 Jun 2021 to 25 Jun 2021

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Country/Territory: United States
Virtual, Online
Period: 19/06/21 to 25/06/21
