MSHI-Mamba: A Multi-Stage Hierarchical Interaction Model for 3D Point Clouds Based on Mamba

  • Zhiguo Zhou*
  • Qian Wang
  • Xuehua Zhou

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Mamba, based on the state space model (SSM), offers an efficient alternative to the quadratic complexity of attention, showing promise for long-sequence data processing and global modeling in 3D object detection. However, applying it to this domain presents specific challenges: traditional serialization methods can compromise the spatial structure of 3D data, and the standard single-layer SSM design may limit cross-layer feature extraction. To address these issues, this paper proposes MSHI-Mamba, a Mamba-based multi-stage hierarchical interaction architecture for 3D backbone networks. We introduce a cross-layer complementary cross-attention module (C3AM) to mitigate feature redundancy in cross-layer encoding, as well as a bi-shift scanning strategy (BSS) that uses hybrid space-filling curves with shift scanning to better preserve spatial continuity and expand the receptive field during serialization. We also develop a voxel densifying downsampling module (VD-DS) to enhance local spatial information and foreground feature density. Experimental results obtained on the KITTI and nuScenes datasets demonstrate that our approach achieves competitive performance, with a 4.2% improvement in the mAP on KITTI, validating the effectiveness of the proposed components.
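The abstract's serialization idea can be illustrated with a minimal sketch. It is not the paper's bi-shift scanning strategy or its hybrid curves. It only assumes a plain Z-order (Morton) curve and a non-negative integer shift applied to voxel coordinates before sorting. The names `morton3d` and `serialize` are hypothetical and chosen for this illustration.

```python
def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Z-order key."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def serialize(voxels, shift=(0, 0, 0)):
    """Return voxel indices ordered along the Z-order curve.

    `shift` is an assumed non-negative offset added to every coordinate
    before the key is computed; it changes where the curve's cell
    boundaries fall, and therefore which voxels end up adjacent.
    """
    sx, sy, sz = shift
    keys = [morton3d(x + sx, y + sy, z + sz) for x, y, z in voxels]
    return sorted(range(len(voxels)), key=keys.__getitem__)


if __name__ == "__main__":
    voxels = [(2, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
    print(serialize(voxels))             # [2, 1, 3, 0]
    print(serialize(voxels, (1, 0, 0)))  # [1, 2, 0, 3]
```

In the unshifted order, the spatial neighbors (1, 0, 0) and (2, 0, 0) sit at opposite ends of the sequence because they fall on either side of a curve cell boundary. After a shift of one voxel along x, they become adjacent. Scanning the same data under more than one offset is one way a sequence model can see neighborhoods that a single serialization splits apart.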

Original language: English
Article number: 1189
Journal: Applied Sciences (Switzerland)
Volume: 16
Issue number: 3
Publication status: Published - Feb 2026
Externally published: Yes

Keywords

  • 3D object detection
  • Mamba
  • autonomous driving
  • cross-attention mechanism
  • space-filling curve
  • state space model
