Abstract
Semantic segmentation and cloud removal via optical-Synthetic Aperture Radar (SAR) fusion are crucial tasks in Earth observation. Existing works have proposed multitask architectures that address segmentation and cloud removal simultaneously, but they lack targeted modeling of under-cloud features, which leads to blurred ground-object reconstruction and poor fine-grained recognition in thick-cloud regions. To address these challenges, this article proposes a Cloud-Guided Mamba-based Multitask Network (CGM2Net) for end-to-end multimodal segmentation and cloud removal. Specifically, CGM2Net uses a dual-stream encoder and integrates three key modules at each stage: a cloud-guided feature restoration module, which, under cloud-mask-guided gating, exploits spatial- and channel-wise cross-modal correlations to reconstruct under-cloud optical features and suppress SAR speckle; a cloud-guided semantic interaction module, which performs mask-aware state-space modulation to enable selective, region-aware cross-modal semantic exchange; and a Mamba-based fusion module, which adaptively fuses the enhanced multimodal features to fully exploit the complementary information across modalities. Two task-specific decoders jointly optimize segmentation and cloud removal, promoting mutual enhancement between the semantic prior and visual restoration. Experiments on the M3M-CR and LuojiaSET-OSFCR datasets demonstrate that CGM2Net achieves state-of-the-art performance on both tasks. Ablation studies and feature visualizations further validate the complementary roles and effectiveness of the proposed modules.
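The idea of cloud-mask-guided gating mentioned in the abstract can be illustrated with a deliberately simplified sketch. This is not the CGM2Net restoration module, whose design the abstract does not specify; it only assumes that a soft cloud mask in [0, 1] decides, per pixel, how much to rely on SAR-derived features versus optical features. The function name `cloud_guided_gate` and the linear blending rule are hypothetical choices made for illustration.

```python
def cloud_guided_gate(optical, sar, cloud_mask):
    """Blend one feature channel per pixel, leaning on SAR under clouds.

    Hypothetical illustration only, not the published module.
    optical, sar: 2D lists of floats with identical shape.
    cloud_mask:   2D list of floats in [0, 1]; 1.0 means fully cloud-covered.
    """
    if not (len(optical) == len(sar) == len(cloud_mask)):
        raise ValueError("inputs must have the same number of rows")

    gated = []
    for opt_row, sar_row, mask_row in zip(optical, sar, cloud_mask):
        if not (len(opt_row) == len(sar_row) == len(mask_row)):
            raise ValueError("inputs must have the same number of columns")
        # Assumed rule: the mask weights SAR, its complement weights optical.
        gated.append(
            [m * s + (1.0 - m) * o for o, s, m in zip(opt_row, sar_row, mask_row)]
        )
    return gated


# Toy 2x2 example: clear, fully clouded, half clouded, and clear pixels.
optical = [[0.9, 0.2], [0.4, 0.8]]
sar = [[0.1, 0.6], [0.5, 0.3]]
cloud_mask = [[0.0, 1.0], [0.5, 0.0]]
restored = cloud_guided_gate(optical, sar, cloud_mask)
```

In this toy rule, clear pixels keep their optical value, fully clouded pixels take the SAR value, and partially clouded pixels receive a weighted mix. A learned module would replace the fixed blend with trainable cross-modal attention or gating.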
| Original language | English |
|---|---|
| Pages (from-to) | 14358-14374 |
| Number of pages | 17 |
| Journal | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Volume | 19 |
| DOIs | |
| Publication status | Published - 2026 |
| Externally published | Yes |
Keywords
- Cloud removal
- multimodal segmentation
- multitask learning
- remote sensing data
Fingerprint
Dive into the research topics of 'CGM2Net: Cloud-Guided Mamba-Based Multitask Network for Multimodal Remote Sensing Semantic Segmentation and Cloud Removal'. Together they form a unique fingerprint.