TY - JOUR
T1 - Deep-IRTarget
T2 - An Automatic Target Detector in Infrared Imagery Using Dual-Domain Feature Extraction and Allocation
AU - Zhang, Ruiheng
AU - Xu, Lixin
AU - Yu, Zhengyu
AU - Shi, Ye
AU - Mu, Chengpo
AU - Xu, Min
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2022
Y1 - 2022
N2 - Recently, convolutional neural networks (CNNs) have brought impressive improvements to object detection. However, detecting targets in infrared images remains challenging, because the poor texture information, low resolution, and high noise levels of thermal imagery restrict the feature extraction ability of CNNs. To deal with these difficulties in feature extraction, we propose a novel backbone network named Deep-IRTarget, composed of a frequency feature extractor, a spatial feature extractor, and a dual-domain feature resource allocation model. A Hypercomplex Infrared Fourier Transform is developed to calculate infrared intensity saliency by designing hypercomplex representations in the frequency domain, while a convolutional neural network is invoked to extract feature maps in the spatial domain. Features from the frequency and spatial domains are stacked to construct dual-domain features. To efficiently integrate and recalibrate them, we propose a Resource Allocation model for Features (RAF). Well-designed channel attention and position attention blocks are used in RAF to extract interdependent relationships along the channel and position dimensions, respectively, and to capture channel-wise and position-wise contextual information. Extensive experiments are conducted on three challenging infrared imagery databases. We achieve 10.14%, 9.1%, and 8.05% improvements in mAP scores over the current state-of-the-art method on MWIR, BITIR, and WCIR, respectively.
AB - Recently, convolutional neural networks (CNNs) have brought impressive improvements to object detection. However, detecting targets in infrared images remains challenging, because the poor texture information, low resolution, and high noise levels of thermal imagery restrict the feature extraction ability of CNNs. To deal with these difficulties in feature extraction, we propose a novel backbone network named Deep-IRTarget, composed of a frequency feature extractor, a spatial feature extractor, and a dual-domain feature resource allocation model. A Hypercomplex Infrared Fourier Transform is developed to calculate infrared intensity saliency by designing hypercomplex representations in the frequency domain, while a convolutional neural network is invoked to extract feature maps in the spatial domain. Features from the frequency and spatial domains are stacked to construct dual-domain features. To efficiently integrate and recalibrate them, we propose a Resource Allocation model for Features (RAF). Well-designed channel attention and position attention blocks are used in RAF to extract interdependent relationships along the channel and position dimensions, respectively, and to capture channel-wise and position-wise contextual information. Extensive experiments are conducted on three challenging infrared imagery databases. We achieve 10.14%, 9.1%, and 8.05% improvements in mAP scores over the current state-of-the-art method on MWIR, BITIR, and WCIR, respectively.
KW - Convolutional neural networks
KW - feature extraction
KW - infrared imagery
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=85104196422&partnerID=8YFLogxK
U2 - 10.1109/TMM.2021.3070138
DO - 10.1109/TMM.2021.3070138
M3 - Article
AN - SCOPUS:85104196422
SN - 1520-9210
VL - 24
SP - 1735
EP - 1749
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -