Deep-IRTarget: An Automatic Target Detector in Infrared Imagery Using Dual-Domain Feature Extraction and Allocation

Ruiheng Zhang, Lixin Xu, Zhengyu Yu, Ye Shi, Chengpo Mu, Min Xu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

95 Citations (Scopus)

Abstract

Recently, convolutional neural networks (CNNs) have brought impressive improvements to object detection. However, detecting targets in infrared images remains challenging, because the poor texture information, low resolution, and high noise levels of thermal imagery restrict the feature extraction ability of CNNs. To deal with these difficulties in feature extraction, we propose a novel backbone network named Deep-IRTarget, composed of a frequency feature extractor, a spatial feature extractor, and a dual-domain feature resource allocation model. A Hypercomplex Infrared Fourier Transform is developed to calculate infrared intensity saliency by designing hypercomplex representations in the frequency domain, while a convolutional neural network is invoked to extract feature maps in the spatial domain. Features from the frequency and spatial domains are stacked to construct dual-domain features. To efficiently integrate and recalibrate them, we propose a Resource Allocation model for Features (RAF). The well-designed channel attention block and position attention block in the RAF extract interdependent relationships among the channel and position dimensions, respectively, and capture channel-wise and position-wise contextual information. Extensive experiments are conducted on three challenging infrared imagery databases. We achieve improvements of 10.14%, 9.1%, and 8.05% in mAP scores over the current state-of-the-art method on MWIR, BITIR, and WCIR, respectively.
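The abstract only sketches the RAF design. As an illustration, below is a minimal PyTorch sketch of how stacked dual-domain features might be recalibrated by channel and position attention blocks. It assumes DANet-style affinity attention; all module names, tensor shapes, the 1x1-convolution projections, and the sequential ordering of the two blocks are assumptions for illustration, not the paper's published implementation.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Channel attention: models inter-channel dependencies via a C x C
    affinity map (an assumed design in the spirit of the RAF channel block)."""
    def forward(self, x):
        b, c, h, w = x.shape
        q = x.view(b, c, -1)                  # B x C x HW
        k = x.view(b, c, -1).transpose(1, 2)  # B x HW x C
        attn = torch.softmax(q @ k, dim=-1)   # B x C x C channel affinity
        out = (attn @ x.view(b, c, -1)).view(b, c, h, w)
        return out + x                        # residual recalibration


class PositionAttention(nn.Module):
    """Position attention: models pixel-wise context via an HW x HW affinity map."""
    def __init__(self, c):
        super().__init__()
        self.q = nn.Conv2d(c, c // 8, 1)
        self.k = nn.Conv2d(c, c // 8, 1)
        self.v = nn.Conv2d(c, c, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).view(b, -1, h * w).transpose(1, 2)  # B x HW x C'
        k = self.k(x).view(b, -1, h * w)                  # B x C' x HW
        attn = torch.softmax(q @ k, dim=-1)               # B x HW x HW
        v = self.v(x).view(b, c, h * w)                   # B x C x HW
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return out + x


class RAF(nn.Module):
    """Resource Allocation for Features: stack spatial- and frequency-domain
    feature maps along channels, then recalibrate with both attention blocks."""
    def __init__(self, c_spatial, c_freq):
        super().__init__()
        self.pos = PositionAttention(c_spatial + c_freq)
        self.chn = ChannelAttention()

    def forward(self, f_spatial, f_freq):
        x = torch.cat([f_spatial, f_freq], dim=1)  # dual-domain stacking
        return self.chn(self.pos(x))


# Example: fuse a 256-channel spatial map with a 1-channel frequency saliency map
raf = RAF(c_spatial=256, c_freq=1)
fused = raf(torch.randn(2, 256, 32, 32), torch.randn(2, 1, 32, 32))
```

The affinity-based formulation is one plausible reading of "interdependent relationships among channel and position dimensions"; the actual RAF may differ in projection widths, normalization, or how the two blocks' outputs are combined.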

Original language: English
Pages (from-to): 1735-1749
Number of pages: 15
Journal: IEEE Transactions on Multimedia
Volume: 24
DOI
Publication status: Published - 2022
