Abstract
Multimodal remote sensing image registration is a fundamental step in multisource data fusion and geospatial analysis. However, nonlinear radiometric, textural, and geometric discrepancies across modalities challenge both intensity-based and traditional feature-based methods, limiting their accuracy and robustness. Under extreme appearance variations, these methods often produce local mismatches and can introduce geometric distortions. To address these issues, this paper proposes DS-MAR, a multimodal image registration framework based on dual-stream multiscale attention and adaptive deformation field refinement. First, we propose a dual-stream multiscale dynamic attention network that extracts hierarchical features from multimodal remote sensing images via an independent dual-branch architecture and uses deformable cross-attention to achieve semantic-level dynamic alignment across modalities, thereby producing highly accurate initial deformation field estimates. Next, we introduce an adaptive deformation field refinement module that uses confidence-aware residual learning to rectify low-confidence regions while maintaining consistency in high-confidence areas. Then, to suppress non-physical distortions in the deformation field, we design a composite loss that integrates second-order smoothness constraints with a Jacobian determinant penalty. Finally, we adopt a multi-stage training strategy to enable synergistic optimization from feature extraction to deformation refinement. Experimental results on multiple public datasets show that our method substantially outperforms existing state-of-the-art approaches in registration accuracy and robustness, especially for image pairs exhibiting extreme appearance variations.
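The abstract names a second-order smoothness constraint, a Jacobian determinant penalty, and confidence-aware residual refinement, but gives no formulas. The NumPy sketch below shows one common way such terms are written for a dense 2D displacement field. It is illustrative only and should not be read as the DS-MAR implementation: the function names, the weights `w_smooth` and `w_jac`, the finite-difference stencils, and the `(1 - confidence)` gating rule are all assumptions.

```python
import numpy as np

# Assumed convention: disp has shape (2, H, W); disp[0] is the row (y)
# displacement and disp[1] the column (x) displacement, both in pixels.

def second_order_smoothness(disp):
    """Mean squared second derivatives (a bending-energy style term)."""
    center = disp[:, 1:-1, 1:-1]
    d_yy = disp[:, 2:, 1:-1] - 2.0 * center + disp[:, :-2, 1:-1]
    d_xx = disp[:, 1:-1, 2:] - 2.0 * center + disp[:, 1:-1, :-2]
    d_xy = (disp[:, 2:, 2:] - disp[:, 2:, :-2]
            - disp[:, :-2, 2:] + disp[:, :-2, :-2]) / 4.0
    return float(np.mean(d_yy ** 2 + d_xx ** 2 + 2.0 * d_xy ** 2))

def jacobian_determinant(disp):
    """Determinant of the Jacobian of phi(p) = p + disp(p) at interior pixels."""
    duy_dy = (disp[0, 2:, 1:-1] - disp[0, :-2, 1:-1]) / 2.0
    duy_dx = (disp[0, 1:-1, 2:] - disp[0, 1:-1, :-2]) / 2.0
    dux_dy = (disp[1, 2:, 1:-1] - disp[1, :-2, 1:-1]) / 2.0
    dux_dx = (disp[1, 1:-1, 2:] - disp[1, 1:-1, :-2]) / 2.0
    return (1.0 + duy_dy) * (1.0 + dux_dx) - duy_dx * dux_dy

def folding_penalty(disp):
    """Penalize non-positive determinants, i.e. locally folded mappings."""
    det = jacobian_determinant(disp)
    return float(np.mean(np.maximum(0.0, -det)))

def regularization_loss(disp, w_smooth=1.0, w_jac=1.0):
    """Weighted sum of the two terms; the weights are hypothetical."""
    return (w_smooth * second_order_smoothness(disp)
            + w_jac * folding_penalty(disp))

def refine_with_confidence(disp, residual, confidence):
    """Hypothetical gating: apply the residual where confidence (H, W) is low."""
    return disp + (1.0 - confidence) * residual
```

Under these assumptions an affine displacement field has zero smoothness cost, because its second derivatives vanish, while a field that reflects the image along one axis has a negative determinant everywhere and is penalized. With confidence equal to 1, the refinement step leaves the initial estimate unchanged.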
| Original language | English |
|---|---|
| Journal | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| DOIs | |
| Publication status | Accepted/In press - 2026 |
Keywords
- deep learning
- deformation field refinement
- dual-stream multiscale attention
- image registration
- multimodal remote sensing images
Fingerprint
Dive into the research topics of 'A Multimodal Remote Sensing Image Registration Framework with Dual-Stream Multiscale Attention and Adaptive Deformation Refinement'. Together they form a unique fingerprint.