TY - GEN
T1 - MDFNet
T2 - 2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
AU - Wei, Tianyu
AU - Chen, He
AU - Wang, Jue
AU - Liu, Wenchao
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The semantic segmentation of multimodal remote sensing (RS) images, utilizing optical and synthetic aperture radar (SAR) data, has attracted attention in recent studies. Advanced studies have adaptively modeled and fused modality-share and modality-specific information beneficial for segmentation and achieved competitive performance. However, in challenging scenarios such as low image contrast or blurred textures in optical images and speckle noise or foreshortening in SAR images, modality-specific information may be lost. This study proposes a multimodal feature decomposition and fusion network (MDFNet) designed for multimodal RS image semantic segmentation using optical and SAR images. By decomposing multimodal features into modality-share and modality-difference features and applying gradient descent on the modality-difference features, modality-specific information beneficial for segmentation can be retained. Specifically, we designed an MDF decoder with MDF blocks. The MDF block maps multimodal features into the same feature space and calculates the difference of the multimodal features to obtain modality-share and modality-difference features, respectively, then fuses these features. The MDF decoder optimizes the modality-difference features by gradient descent using adaptive modeling, thereby retaining modality-specific information that is beneficial for segmentation. Comprehensive experiments conducted on the DFC20 dataset demonstrate that the proposed MDFNet surpasses representative methods.
AB - The semantic segmentation of multimodal remote sensing (RS) images, utilizing optical and synthetic aperture radar (SAR) data, has attracted attention in recent studies. Advanced studies have adaptively modeled and fused modality-share and modality-specific information beneficial for segmentation and achieved competitive performance. However, in challenging scenarios such as low image contrast or blurred textures in optical images and speckle noise or foreshortening in SAR images, modality-specific information may be lost. This study proposes a multimodal feature decomposition and fusion network (MDFNet) designed for multimodal RS image semantic segmentation using optical and SAR images. By decomposing multimodal features into modality-share and modality-difference features and applying gradient descent on the modality-difference features, modality-specific information beneficial for segmentation can be retained. Specifically, we designed an MDF decoder with MDF blocks. The MDF block maps multimodal features into the same feature space and calculates the difference of the multimodal features to obtain modality-share and modality-difference features, respectively, then fuses these features. The MDF decoder optimizes the modality-difference features by gradient descent using adaptive modeling, thereby retaining modality-specific information that is beneficial for segmentation. Comprehensive experiments conducted on the DFC20 dataset demonstrate that the proposed MDFNet surpasses representative methods.
KW - Modality-specific information
KW - multimodal remote sensing image semantic segmentation
KW - remote sensing
KW - synthetic aperture radar (SAR)
UR - http://www.scopus.com/inward/record.url?scp=86000026803&partnerID=8YFLogxK
U2 - 10.1109/ICSIDP62679.2024.10868650
DO - 10.1109/ICSIDP62679.2024.10868650
M3 - Conference contribution
AN - SCOPUS:86000026803
T3 - IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
BT - IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 22 November 2024 through 24 November 2024
ER -