TY - JOUR
T1 - D2FT-Net
T2 - Frequency-Spatial Dual Domain Fine-Tuning of Vision Foundation Models for Remote Sensing Domain Generalization Semantic Segmentation
AU - Shen, Zimeng
AU - Wang, Jue
AU - Wei, Tianyu
AU - Liu, Wenchao
AU - Lu, Xiaoyan
AU - Chen, Liang
N1 - Publisher Copyright: © 2026 IEEE.
PY - 2026
Y1 - 2026
N2 - In recent years, remote sensing domain generalization semantic segmentation has attracted increasing attention due to significant domain shifts caused by variations in sensors, imaging conditions, and geographic regions. Vision foundation models (VFMs) possess strong general-purpose feature extraction capabilities and can be effectively transferred across diverse data, showing great potential for remote sensing domain generalization semantic segmentation. However, existing VFM-based spatial-domain parameter-efficient fine-tuning methods struggle to handle pronounced cross-domain intraclass variations in remote sensing. To address this issue, the adaptive frequency-aware adapter (AFA-Adapter) is proposed, which adaptively selects frequency components to improve cross-domain intraclass feature consistency. Building upon this, the spatial multiprototype adapter (SMP-Adapter) is proposed, which clusters multiple prototypes for features of each land-cover category to model complex intraclass diversity. Land-cover features are then weighted by their nearest intraclass prototype, thereby enhancing the discriminability of easily confused features at feature cluster boundaries. By integrating these two modules, we propose a frequency-spatial dual domain fine-tuning network (D2FT-Net), which effectively alleviates cross-domain intraclass variations and improves the generalization capability of VFMs for remote sensing domain generalization semantic segmentation. Extensive experiments under four cross-domain settings demonstrate the effectiveness of the proposed D2FT-Net, which achieves an average mIoU improvement of 1.09% over state-of-the-art methods, with the best gain reaching 1.64%. The source code will be released at https://github.com/ssshen0315/D2FT-Net
AB - In recent years, remote sensing domain generalization semantic segmentation has attracted increasing attention due to significant domain shifts caused by variations in sensors, imaging conditions, and geographic regions. Vision foundation models (VFMs) possess strong general-purpose feature extraction capabilities and can be effectively transferred across diverse data, showing great potential for remote sensing domain generalization semantic segmentation. However, existing VFM-based spatial-domain parameter-efficient fine-tuning methods struggle to handle pronounced cross-domain intraclass variations in remote sensing. To address this issue, the adaptive frequency-aware adapter (AFA-Adapter) is proposed, which adaptively selects frequency components to improve cross-domain intraclass feature consistency. Building upon this, the spatial multiprototype adapter (SMP-Adapter) is proposed, which clusters multiple prototypes for features of each land-cover category to model complex intraclass diversity. Land-cover features are then weighted by their nearest intraclass prototype, thereby enhancing the discriminability of easily confused features at feature cluster boundaries. By integrating these two modules, we propose a frequency-spatial dual domain fine-tuning network (D2FT-Net), which effectively alleviates cross-domain intraclass variations and improves the generalization capability of VFMs for remote sensing domain generalization semantic segmentation. Extensive experiments under four cross-domain settings demonstrate the effectiveness of the proposed D2FT-Net, which achieves an average mIoU improvement of 1.09% over state-of-the-art methods, with the best gain reaching 1.64%. The source code will be released at https://github.com/ssshen0315/D2FT-Net
KW - Domain generalization
KW - fine-tuning
KW - remote sensing
KW - semantic segmentation
KW - vision foundation models (VFMs)
UR - https://www.scopus.com/pages/publications/105038756946
U2 - 10.1109/TGRS.2026.3691111
DO - 10.1109/TGRS.2026.3691111
M3 - Article
AN - SCOPUS:105038756946
SN - 0196-2892
VL - 64
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5621016
ER -