
MMSegRWKV: Enhancing Multimodal MRI Segmentation for Internet of Medical Things-Enabled Healthcare with RWKV-Inspired Architectures

Research output: Contribution to journal › Article › peer-review

Abstract

Magnetic resonance imaging (MRI) plays a crucial role in clinical diagnostics. Unlike single-sequence MRI, multimodal MRI integrates complementary information from sequences such as T1, T2, and FLAIR, offering richer structural and pathological insights at the cost of greater modeling complexity. In the Internet of Medical Things (IoMT) environment, massive multimodal MRI data are continuously collected from distributed hospitals and remote monitoring systems, creating an urgent need for segmentation models that can effectively fuse multimodal information while remaining efficient and scalable under resource constraints. To meet these challenges, we develop MMSegRWKV, a multimodal 3D segmentation framework that combines the efficiency of the Receptance Weighted Key Value (RWKV) model with a U-shaped architecture. The framework achieves linear-complexity modeling of volumetric contexts and is further enhanced by Dual-View WKV, which factorizes dependencies along spatial and temporal dimensions and integrates them through a learnable weighting scheme. Inspired by clinical practice, a dynamic short-time window is introduced along the slice axis to balance local discriminative focus against global dependency modeling. On the multimodal side, rather than simply stacking sequences, we incorporate a Residual Factorization Machine (ResFM) to capture second-order cross-modal interactions, ensuring that complementary features across modalities are explicitly modeled. In addition, a Bi-Shift scheme separately considers preceding and succeeding tokens for both temporal and spatial relations, thereby improving sequential representation learning.
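The Bi-Shift idea — mixing each token with its immediate predecessor and successor along an axis before the WKV operator — can be illustrated with a minimal NumPy sketch. The function name, fixed mixing weights, and zero padding here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def bi_shift(tokens, alpha=0.5, beta=0.25):
    """Mix each token with its preceding and succeeding neighbors.

    tokens : (T, C) array of token features along one axis (e.g. the slice axis).
    alpha  : weight kept on the token itself.
    beta   : weight on the preceding token; the remaining (1 - alpha - beta)
             goes to the succeeding token. Sequence boundaries are zero-padded.
    """
    prev = np.vstack([np.zeros_like(tokens[:1]), tokens[:-1]])  # preceding token
    nxt = np.vstack([tokens[1:], np.zeros_like(tokens[:1])])    # succeeding token
    return alpha * tokens + beta * prev + (1.0 - alpha - beta) * nxt
```

In the paper the mixing weights are presumably learnable; fixed scalars are used here only to keep the sketch self-contained.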
Extensive evaluation on three public benchmarks and an in-house dataset demonstrates that MMSegRWKV not only improves segmentation accuracy but also achieves lower computational complexity, a reduced memory footprint, and faster inference compared with representative CNN-, Transformer-, and Mamba-based models. These results highlight its strong potential for practical deployment in IoMT scenarios, enabling scalable, low-latency, and resource-conscious medical imaging solutions. The code and models are publicly available at link.
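The second-order cross-modal interaction that ResFM builds on follows the classic factorization-machine identity, which computes all pairwise latent-factor interactions in O(dk) time rather than O(d²k). A minimal sketch, with variable names and shapes as illustrative assumptions (the actual ResFM additionally wraps this term in a residual path):

```python
import numpy as np

def fm_second_order(x, V):
    """Second-order factorization-machine interaction term.

    x : (d,) concatenated per-modality feature vector.
    V : (d, k) latent factor matrix, one k-dim embedding per feature.

    Uses the standard O(dk) identity:
      sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ]
    """
    xv = x @ V                    # (k,)  linear term per factor
    x2v2 = (x ** 2) @ (V ** 2)    # (k,)  self-interaction correction
    return 0.5 * float(np.sum(xv ** 2 - x2v2))
```

The identity is what makes explicit pairwise cross-modal modeling cheap enough for a segmentation backbone: the naive double sum over feature pairs is never materialized.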

Original language: English
Journal: IEEE Internet of Things Journal
DOIs
Publication status: Accepted/In press - 2026
Externally published: Yes

Keywords

  • Internet of Medical Things
  • Medical Image Segmentation
  • RWKV
  • multimodal MRI
