Unsupervised Domain Adaptation With Hierarchical Masked Dual-Adversarial Network for End-to-End Classification of Multisource Remote Sensing Data

Wen Shuai Hu, Wei Li, Heng Chao Li, Xudong Zhao, Mengmeng Zhang*, Ran Tao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Although unsupervised domain adaptation (UDA) has been successfully applied to cross-scene classification of multisource remote sensing (MSRS) data, several issues remain: 1) the vast majority of existing methods are patch-based, requiring pixel-by-pixel processing at high complexity and ignoring the role of unlabeled data across different domains; and 2) traditional masked autoencoder (MAE)-based methods lack effective multiscale analysis and require pre-training, ignoring the role of low-level representations. To address these issues, a hierarchical masked dual-adversarial DA network (HMDA-DANet) is proposed for cross-domain end-to-end classification of MSRS data. First, a hierarchical asymmetric MAE (HAMAE) without pre-training is designed. It contains a frequency dynamic large-scale convolutional (FDLConv) block to enhance important structural information in the frequency domain, and an intramodality enhancement and intermodality interaction (IAEIEI) block to embed additional information beyond the domain distribution by expanding the cross-modal reconstruction space. This allows representative multimodal multiscale features to be extracted while improving, to some extent, their generalization to the target domain (TD). Then, a multimodal multiscale feature fusion (MMFF) block is built to model the spatial and scale dependencies for feature fusion and to reduce the layer-by-layer transmission of redundant or interfering information. Finally, a dual-discriminator-based DA (DDA) block is designed for class-specific semantic feature and global structural alignment in both the spatial and prediction spaces. This enables HAMAE to model cross-modal, cross-scale, and cross-domain associations, yielding more representative domain-invariant multimodal fusion features. Extensive experiments on five cross-domain MSRS datasets verify the superiority of the proposed HMDA-DANet over other state-of-the-art methods.
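
The abstract does not describe how HAMAE selects which regions to mask. As a generic illustration only, the sketch below shows the random patch masking that MAE-style methods commonly build on: patch indices are shuffled and a fixed fraction is hidden from the encoder. The function name `random_patch_mask`, the 16-patch grid, and the 0.75 mask ratio are assumptions chosen for the example and are not taken from the paper.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists.

    Generic MAE-style random masking; hypothetical helper, not the
    HAMAE masking strategy, which the abstract does not specify.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])   # hidden from the encoder
    visible = sorted(order[num_masked:])  # passed to the encoder
    return visible, masked


# Example values (assumed): a 4x4 patch grid with 75% of patches masked.
visible, masked = random_patch_mask(num_patches=16, mask_ratio=0.75, seed=0)
```

In a reconstruction setup of this kind, the encoder would see only the `visible` patches and the decoder would be trained to recover the `masked` ones.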

Original language: English
Article number: 4409917
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 63
Publication status: Published - 2025

Keywords

  • Cross-scene
  • end-to-end classification
  • frequency attention
  • masked autoencoder (MAE)
  • multiadversarial learning
  • multimodal masking learning
  • multisource remote sensing (MSRS) data
