Differential Feature Awareness Network Within Antagonistic Learning for Infrared-Visible Object Detection

Ruiheng Zhang, Lu Li, Qi Zhang, Jin Zhang, Lixin Xu*, Baomin Zhang*, Binglu Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

49 Citations (Scopus)

Abstract

The combination of infrared and visible videos aims to gather more comprehensive feature information from multiple sources and to achieve better results than a single modality on practical tasks such as detection and segmentation. However, most existing dual-modality object detection algorithms ignore the modal differences and fail to consider the correlation between feature extraction and fusion, which leads to incomplete extraction and inadequate fusion of dual-modality features. This raises the question of how to preserve the unique features of each modality while fully utilizing the complementary infrared and visible information. To address these challenges, we propose a novel Differential Feature Awareness Network (DFANet) within antagonistic learning for infrared and visible object detection. The proposed model consists of an Antagonistic Feature Extraction with Divergence (AFED) module, which extracts differential infrared and visible features carrying modality-specific information, and an Attention-based Differential Feature Fusion (ADFF) module, which fully fuses the extracted differential features. We compare DFANet with existing state-of-the-art models on two benchmark datasets to demonstrate its robustness and superiority, and conduct numerous ablation experiments to illustrate its effectiveness.

Original language: English
Pages (from-to): 6735-6748
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 34
Issue number: 8
Publication status: Published - 2024

Keywords

  • Infrared-visible object detection
  • multi-modal feature fusion
