Differential Feature Awareness Network Within Antagonistic Learning for Infrared-Visible Object Detection

Ruiheng Zhang; Lu Li; Qi Zhang; Jin Zhang; Lixin Xu; Baomin Zhang; Binglu Wang

doi:10.1109/TCSVT.2023.3289142

Ruiheng Zhang, Lu Li, Qi Zhang, Jin Zhang, Lixin Xu^*, Baomin Zhang^*, Binglu Wang

^*Corresponding author for this work

School of Mechatronical Engineering

Research output: Contribution to journal › Article › peer-review

59 Citations (Scopus)

Abstract

Combining infrared and visible videos aims to gather more comprehensive feature information from multiple sources and to achieve better results than a single modality on practical tasks such as detection and segmentation. However, most existing dual-modality object detection algorithms ignore the modal differences and fail to consider the correlation between feature extraction and fusion, which leads to incomplete extraction and inadequate fusion of dual-modality features. This raises the question of how to preserve the unique features of each modality while fully utilizing the complementary infrared and visible information. To address these challenges, we propose a novel Differential Feature Awareness Network (DFANet) within antagonistic learning for infrared and visible object detection. The proposed model consists of an Antagonistic Feature Extraction with Divergence (AFED) module, which extracts differential infrared and visible features carrying unique information, and an Attention-based Differential Feature Fusion (ADFF) module, which fully fuses the extracted differential features. We compare DFANet with existing state-of-the-art models on two benchmark datasets to demonstrate its robustness and superiority, and conduct numerous ablation experiments to illustrate its effectiveness.

Original language	English
Pages (from-to)	6735-6748
Number of pages	14
Journal	IEEE Transactions on Circuits and Systems for Video Technology
Volume	34
Issue number	8
DOIs	https://doi.org/10.1109/TCSVT.2023.3289142
Publication status	Published - 2024

Keywords

Infrared-visible object detection
multi-modal feature fusion

Cite this

Zhang, R., Li, L., Zhang, Q., Zhang, J., Xu, L., Zhang, B., & Wang, B. (2024). Differential Feature Awareness Network Within Antagonistic Learning for Infrared-Visible Object Detection. IEEE Transactions on Circuits and Systems for Video Technology, 34(8), 6735-6748. https://doi.org/10.1109/TCSVT.2023.3289142

@article{fe52867e269c4a0c81b3d33d2a8643fb,
title = "Differential Feature Awareness Network Within Antagonistic Learning for Infrared-Visible Object Detection",
abstract = "Combining infrared and visible videos aims to gather more comprehensive feature information from multiple sources and to achieve better results than a single modality on practical tasks such as detection and segmentation. However, most existing dual-modality object detection algorithms ignore the modal differences and fail to consider the correlation between feature extraction and fusion, which leads to incomplete extraction and inadequate fusion of dual-modality features. This raises the question of how to preserve the unique features of each modality while fully utilizing the complementary infrared and visible information. To address these challenges, we propose a novel Differential Feature Awareness Network (DFANet) within antagonistic learning for infrared and visible object detection. The proposed model consists of an Antagonistic Feature Extraction with Divergence (AFED) module, which extracts differential infrared and visible features carrying unique information, and an Attention-based Differential Feature Fusion (ADFF) module, which fully fuses the extracted differential features. We compare DFANet with existing state-of-the-art models on two benchmark datasets to demonstrate its robustness and superiority, and conduct numerous ablation experiments to illustrate its effectiveness.",
keywords = "Infrared-visible object detection, multi-modal feature fusion",
author = "Ruiheng Zhang and Lu Li and Qi Zhang and Jin Zhang and Lixin Xu and Baomin Zhang and Binglu Wang",
note = "Publisher Copyright: {\textcopyright} 1991-2012 IEEE.",
year = "2024",
doi = "10.1109/TCSVT.2023.3289142",
language = "English",
volume = "34",
pages = "6735--6748",
journal = "IEEE Transactions on Circuits and Systems for Video Technology",
issn = "1051-8215",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "8",
}

TY - JOUR

T1 - Differential Feature Awareness Network Within Antagonistic Learning for Infrared-Visible Object Detection

AU - Zhang, Ruiheng

AU - Li, Lu

AU - Zhang, Qi

AU - Zhang, Jin

AU - Xu, Lixin

AU - Zhang, Baomin

AU - Wang, Binglu

PY - 2024

Y1 - 2024

N2 - Combining infrared and visible videos aims to gather more comprehensive feature information from multiple sources and to achieve better results than a single modality on practical tasks such as detection and segmentation. However, most existing dual-modality object detection algorithms ignore the modal differences and fail to consider the correlation between feature extraction and fusion, which leads to incomplete extraction and inadequate fusion of dual-modality features. This raises the question of how to preserve the unique features of each modality while fully utilizing the complementary infrared and visible information. To address these challenges, we propose a novel Differential Feature Awareness Network (DFANet) within antagonistic learning for infrared and visible object detection. The proposed model consists of an Antagonistic Feature Extraction with Divergence (AFED) module, which extracts differential infrared and visible features carrying unique information, and an Attention-based Differential Feature Fusion (ADFF) module, which fully fuses the extracted differential features. We compare DFANet with existing state-of-the-art models on two benchmark datasets to demonstrate its robustness and superiority, and conduct numerous ablation experiments to illustrate its effectiveness.

AB - Combining infrared and visible videos aims to gather more comprehensive feature information from multiple sources and to achieve better results than a single modality on practical tasks such as detection and segmentation. However, most existing dual-modality object detection algorithms ignore the modal differences and fail to consider the correlation between feature extraction and fusion, which leads to incomplete extraction and inadequate fusion of dual-modality features. This raises the question of how to preserve the unique features of each modality while fully utilizing the complementary infrared and visible information. To address these challenges, we propose a novel Differential Feature Awareness Network (DFANet) within antagonistic learning for infrared and visible object detection. The proposed model consists of an Antagonistic Feature Extraction with Divergence (AFED) module, which extracts differential infrared and visible features carrying unique information, and an Attention-based Differential Feature Fusion (ADFF) module, which fully fuses the extracted differential features. We compare DFANet with existing state-of-the-art models on two benchmark datasets to demonstrate its robustness and superiority, and conduct numerous ablation experiments to illustrate its effectiveness.

KW - Infrared-visible object detection

KW - multi-modal feature fusion

UR - http://www.scopus.com/inward/record.url?scp=85163779927&partnerID=8YFLogxK

U2 - 10.1109/TCSVT.2023.3289142

DO - 10.1109/TCSVT.2023.3289142

M3 - Article

AN - SCOPUS:85163779927

SN - 1051-8215

VL - 34

SP - 6735

EP - 6748

JO - IEEE Transactions on Circuits and Systems for Video Technology

JF - IEEE Transactions on Circuits and Systems for Video Technology

IS - 8

ER -
