D2FT-Net: Frequency-Spatial Dual Domain Fine-Tuning of Vision Foundation Models for Remote Sensing Domain Generalization Semantic Segmentation

  • Beijing Institute of Technology
  • National University of Singapore

Research output: Contribution to journal, Article, peer-reviewed

Abstract

In recent years, remote sensing domain generalization semantic segmentation has attracted increasing attention because variations in sensors, imaging conditions, and geographic regions cause significant domain shifts. Vision foundation models (VFMs) possess strong general-purpose feature extraction capabilities and transfer effectively across diverse data, showing great potential for remote sensing domain generalization semantic segmentation. However, existing VFM-based parameter-efficient fine-tuning methods that operate in the spatial domain struggle to handle the pronounced cross-domain intraclass variations found in remote sensing. To address this issue, the adaptive frequency-aware adapter (AFA-Adapter) is proposed, which adaptively selects frequency components to improve cross-domain intraclass feature consistency. Building upon this, the spatial multiprototype adapter (SMP-Adapter) is proposed, which clusters the features of each land-cover category into multiple prototypes to model complex intraclass diversity. Land-cover features are then weighted by their nearest intraclass prototype, which enhances the discriminability of easily confused features at feature cluster boundaries. By integrating these two modules, we propose a frequency-spatial dual domain fine-tuning network (D2FT-Net), which effectively alleviates cross-domain intraclass variations and improves the generalization capability of VFMs for remote sensing domain generalization semantic segmentation. Extensive experiments under four cross-domain settings demonstrate the effectiveness of the proposed D2FT-Net, which achieves an average mIoU improvement of 1.09% over state-of-the-art methods, with the best gain reaching 1.64%. The source code will be released at https://github.com/ssshen0315/D2FT-Net
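The multiprototype idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' SMP-Adapter: the abstract does not specify the clustering algorithm or the weighting function, so the sketch assumes plain k-means for clustering and cosine similarity to the nearest prototype as the weight. All function names (`class_prototypes`, `weight_by_nearest_prototype`) and the toy 2-D features are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def class_prototypes(features, k, iters=10):
    """Cluster one category's features into up to k prototypes.

    Assumption: plain k-means with deterministic, evenly spaced
    initialization stands in for whatever clustering the paper uses.
    """
    step = max(1, len(features) // k)
    centers = [list(f) for f in features[::step][:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for f in features:
            j = min(range(len(centers)), key=lambda c: sq_dist(f, centers[c]))
            groups[j].append(f)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster goes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def weight_by_nearest_prototype(feature, prototypes):
    """Scale a feature by its cosine similarity to the nearest prototype.

    Assumption: the weighting form is illustrative only.
    """
    nearest = min(prototypes, key=lambda p: sq_dist(feature, p))
    w = cosine(feature, nearest)
    return [w * x for x in feature], w

# Toy features for a single category that has two visual modes.
feats = [(1.0, 0.0), (0.9, 0.1), (1.1, 0.0),
         (0.0, 1.0), (0.1, 0.9), (0.0, 1.1)]
protos = class_prototypes(feats, k=2)
weighted, w = weight_by_nearest_prototype((0.8, 0.2), protos)
```

Using several prototypes per category, rather than a single class mean, lets a feature near one mode of the category be compared against that mode instead of an average that may lie between modes.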

Original language: English
Article number: 5621016
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 64
DOI
Publication status: Published - 2026
