A Multimodal Remote Sensing Image Registration Framework with Dual-Stream Multiscale Attention and Adaptive Deformation Refinement

  • Yunan He
  • Chenxuan Yang
  • Ce Sun*
  • Ping Song
  • *Corresponding author for this work
  • Beijing Institute of Technology
  • CAS - Xi'an Institute of Optics and Precision Mechanics

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal remote sensing image registration is a fundamental step for multisource data fusion and geospatial analysis. However, nonlinear radiometric, textural, and geometric discrepancies across modalities challenge intensity-based and traditional feature-based methods, hindering both accuracy and robustness. Under extreme appearance variations, these methods often suffer from local mismatches and can introduce geometric distortions. To address these issues, this paper proposes DS-MAR, a multimodal image registration framework based on dual-stream multiscale attention and adaptive deformation field refinement. First, we propose a dual-stream multiscale dynamic attention network that extracts hierarchical features from multimodal remote sensing images via an independent dual-branch architecture and leverages deformable cross-attention to achieve semantic-level dynamic alignment across modalities, thereby producing highly accurate initial deformation field estimates. Next, we introduce an adaptive deformation field refinement module that leverages confidence-aware residual learning to rectify low-confidence regions while maintaining consistency in high-confidence areas. Then, to suppress non-physical distortions in the deformation field, we design a composite loss that integrates second-order smoothness constraints with a Jacobian determinant penalty. Finally, we adopt a multi-stage training strategy to enable synergistic optimization from feature extraction to deformation refinement. Experimental results show that our method substantially outperforms existing state-of-the-art approaches in registration accuracy and robustness on multiple public datasets, especially for image pairs exhibiting extreme appearance variations.
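
The abstract does not give the exact form of the composite loss, so the following is only a minimal NumPy sketch of the two regularization terms it names: a second-order smoothness term and a Jacobian determinant penalty. The function names, the (2, H, W) displacement layout, the finite-difference scheme, and the weights `w_smooth` and `w_jac` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def jacobian_determinant(disp):
    """Determinant of the Jacobian of phi(p) = p + disp(p).

    disp has shape (2, H, W): disp[0] is the row (y) displacement and
    disp[1] is the column (x) displacement, both in pixels (assumed layout).
    """
    dy_dy = np.gradient(disp[0], axis=0)
    dy_dx = np.gradient(disp[0], axis=1)
    dx_dy = np.gradient(disp[1], axis=0)
    dx_dx = np.gradient(disp[1], axis=1)
    # J = I + grad(disp); for a 2x2 matrix det = a*d - b*c.
    return (1.0 + dy_dy) * (1.0 + dx_dx) - dy_dx * dx_dy


def second_order_smoothness(disp):
    """Mean squared second derivatives of each displacement component."""
    total = 0.0
    for comp in disp:
        g_y = np.gradient(comp, axis=0)
        g_x = np.gradient(comp, axis=1)
        d_yy = np.gradient(g_y, axis=0)
        d_xx = np.gradient(g_x, axis=1)
        d_xy = np.gradient(g_y, axis=1)
        total += np.mean(d_yy**2 + d_xx**2 + 2.0 * d_xy**2)
    return float(total)


def folding_penalty(disp):
    """Penalize only locations where the determinant is negative (folding)."""
    det = jacobian_determinant(disp)
    return float(np.mean(np.maximum(0.0, -det)))


def regularization_loss(disp, w_smooth=1.0, w_jac=1.0):
    """Weighted sum of the two terms; the weights are hypothetical."""
    return w_smooth * second_order_smoothness(disp) + w_jac * folding_penalty(disp)
```

In this sketch an affine displacement incurs no smoothness cost, because its second derivatives vanish, while any region where the mapping folds over itself has a negative determinant and is penalized in proportion to its magnitude. A training implementation would express the same terms in an autodiff framework so that gradients flow back to the deformation field.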

Keywords

  • deep learning
  • deformation field refinement
  • dual-stream multiscale attention
  • image registration
  • multimodal remote sensing images
