TY - JOUR
T1 - AD-DUNet
T2 - A dual-branch encoder approach by combining axial Transformer with cascaded dilated convolutions for liver and hepatic tumor segmentation
AU - Qi, Hang
AU - Wang, Weijiang
AU - Shi, Yueting
AU - Wang, Xiaohua
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/9
Y1 - 2024/9
AB - Liver cancer remains a significant health concern, and accurate segmentation in CT scans is crucial for diagnosis and treatment. Deep learning-based auxiliary diagnosis techniques, especially those using U-shaped structures, are widely employed in medical image segmentation. However, traditional methods based on Convolutional Neural Networks (CNNs) are generally limited in modeling long-range dependencies. Inspired by the success of Transformers in various vision tasks, researchers have developed approaches that combine Transformers with CNNs. However, many existing hybrid CNN-Transformer models tend to perform poorly on relatively small-scale medical image datasets when trained from scratch. Moreover, some of these methods rely on additional customized fusion modules, which introduce extra workload and parameters to the model. To address these limitations, we propose AD-DUNet, a hybrid CNN-Transformer model for liver and hepatic tumor segmentation, which comprises a dual-branch encoder and a residual decoder. The Transformer-based encoder, built from Axial Transformer (AT) blocks, efficiently captures long-range dependencies across the entire image, while the CNN-based encoder, constructed with cascaded dilated convolution (CDC) blocks, extracts fine-grained local features. The two encoders are combined in the shared residual decoder, eliminating the need for additional fusion modules. Extensive experiments conducted on the LiTS2017 and 3DIRCAD datasets demonstrate the superiority of AD-DUNet over existing models. Remarkably, our approach achieves state-of-the-art results without relying on pre-trained weights, showcasing its efficiency with low complexity and 4.24M parameters.
KW - Convolutional neural network
KW - Deep learning
KW - Dual-branch encoder
KW - Medical image segmentation
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85191835975&partnerID=8YFLogxK
U2 - 10.1016/j.bspc.2024.106397
DO - 10.1016/j.bspc.2024.106397
M3 - Article
AN - SCOPUS:85191835975
SN - 1746-8094
VL - 95
JO - Biomedical Signal Processing and Control
JF - Biomedical Signal Processing and Control
M1 - 106397
ER -