OpenVox: Real-time Instance-level Open-vocabulary Probabilistic Voxel Representation

  • Yinan Deng
  • Bicheng Yao
  • Yihang Tang
  • Tianxing Zhou
  • Yi Yang
  • Yufeng Yue*
  • *Corresponding author for this work
  • Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

In recent years, vision-language models (VLMs) have advanced open-vocabulary mapping, enabling mobile robots to simultaneously achieve environmental reconstruction and high-level semantic understanding. While integrated object cognition helps mitigate semantic ambiguity in point-wise feature maps, efficiently obtaining rich semantic understanding and robust incremental reconstruction at the instance-level remains challenging. To address these challenges, we introduce OpenVox, a real-time incremental open-vocabulary probabilistic instance voxel representation. In the front-end, we design an efficient instance segmentation and comprehension pipeline that enhances language reasoning through encoding captions. In the back-end, we implement probabilistic instance voxels and formulate the cross-frame incremental fusion process into two subtasks: instance association and live map evolution, ensuring robustness to sensor and segmentation noise. Extensive evaluations across multiple datasets demonstrate that OpenVox achieves state-of-the-art performance in zero-shot instance segmentation, semantic segmentation, and open-vocabulary retrieval. The project page of OpenVox is available at https://open-vox.github.io/.

Original language: English
Title of host publication: IROS 2025 - 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, Conference Proceedings
Editors: Christian Laugier, Alessandro Renzaglia, Nikolay Atanasov, Stan Birchfield, Grzegorz Cielniak, Leonardo De Mattos, Laura Fiorini, Philippe Giguere, Kenji Hashimoto, Javier Ibanez-Guzman, Tetsushi Kamegawa, Jinoh Lee, Giuseppe Loianno, Kevin Luck, Hisataka Maruyama, Philippe Martinet, Hadi Moradi, Urbano Nunes, Julien Pettre, Alberto Pretto, Tommaso Ranzani, Arne Ronnau, Silvia Rossi, Elliott Rouse, Fabio Ruggiero, Olivier Simonin, Danwei Wang, Ming Yang, Eiichi Yoshida, Huijing Zhao
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1305-1311
Number of pages: 7
ISBN (electronic): 9798331543938
DOI
Publication status: Published - 2025
Externally published
Event: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025 - Hangzhou, China
Duration: 19 Oct 2025 to 25 Oct 2025

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (print): 2153-0858
ISSN (electronic): 2153-0866

Conference

Conference: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025
Country/Territory: China
City: Hangzhou
Period: 19/10/25 to 25/10/25
