HMDA: A Hybrid Model with Multi-Scale Deformable Attention for Medical Image Segmentation

Mengmeng Wu, Tiantian Liu*, Xin Dai, Chuyang Ye, Jinglong Wu, Shintaro Funahashi, Tianyi Yan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Transformers have been applied to medical image segmentation tasks owing to their excellent long-range modeling capability, compensating for the failure of Convolutional Neural Networks (CNNs) to extract global features. However, the standard self-attention modules in Transformers distribute attention in a uniform and inflexible pattern, which frequently leads to computational redundancy on high-dimensional data and impedes the model's capacity to concentrate precisely on salient image regions. Additionally, achieving effective explicit interaction between the spatially detailed features captured by CNNs and the long-range contextual features provided by Transformers remains challenging. In this work, we propose a Hybrid Transformer and CNN architecture with Multi-scale Deformable Attention (HMDA), designed to address these issues. Specifically, we introduce a Multi-scale Spatially Adaptive Deformable Attention (MSADA) mechanism, which attends to a small set of key sampling points around a reference point within the multi-scale features to achieve better performance. In addition, we propose the Cross Attention Bridge (CAB) module, which integrates multi-scale Transformer features and local features through channel-wise cross attention, enriching feature synthesis. HMDA is validated on multiple datasets, and the results demonstrate the effectiveness of our approach, which achieves competitive results compared with previous methods.
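The idea of attending to a small set of sampling points around a reference within multi-scale features can be illustrated with a minimal sketch. This is not the authors' MSADA implementation: the function names (`bilinear_sample`, `deformable_attention`), the single-channel feature maps, and the choice to pass offsets and attention logits in as plain arguments are all assumptions made for illustration. In deformable attention modules these offsets and logits are typically predicted from the query by learned layers, which the sketch omits.

```python
import math


def bilinear_sample(feature, x, y):
    """Sample a 2D map (list of rows) at fractional pixel (x, y), clamped to the border."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def deformable_attention(levels, reference, offsets, logits):
    """Aggregate a few sampled values per scale around one reference point.

    levels:    list of single-channel 2D feature maps, one per scale (hypothetical layout)
    reference: (rx, ry) in normalized [0, 1] coordinates, shared across scales
    offsets:   offsets[l][k] = (dx, dy) in pixels of level l (assumed given, not learned here)
    logits:    logits[l][k] = unnormalized attention score for that sampling point
    """
    # Normalize attention weights jointly over all levels and sampling points.
    weights = softmax([s for level_logits in logits for s in level_logits])
    out, i = 0.0, 0
    for feature, level_offsets in zip(levels, offsets):
        h, w = len(feature), len(feature[0])
        # Map the normalized reference point into this level's pixel grid.
        px, py = reference[0] * (w - 1), reference[1] * (h - 1)
        for dx, dy in level_offsets:
            out += weights[i] * bilinear_sample(feature, px + dx, py + dy)
            i += 1
    return out
```

With K sampling points on each of L scales, the query reads only L × K values instead of every position in every feature map, which is where the reduction in redundant computation comes from.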

Original language: English
Pages (from-to): 1243-1255
Number of pages: 13
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 29
Issue number: 2
DOIs: https://doi.org/10.1109/JBHI.2024.3469230
Publication status: Published - 2025

Keywords

  • Medical image segmentation
  • Cross attention bridge
  • Hybrid model
  • Multi-scale deformable attention

Cite this

Wu, M., Liu, T., Dai, X., Ye, C., Wu, J., Funahashi, S., & Yan, T. (2025). HMDA: A Hybrid Model with Multi-Scale Deformable Attention for Medical Image Segmentation. IEEE Journal of Biomedical and Health Informatics, 29(2), 1243-1255. https://doi.org/10.1109/JBHI.2024.3469230