TY - GEN
T1 - Collaborative detection of infrared targets in the overlapping region of four-aperture fields of view based on the residual local pyramid attention network
AU - Zhao, Siyuan
AU - Luo, Lin
AU - Jin, Weiqi
N1 - Publisher Copyright: © COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only.
PY - 2025/10/28
Y1 - 2025/10/28
N2 - Four-aperture infrared images have an excessively large input size, which calls for a lightweight detection model. To meet this requirement, this paper proposes a Residual Local Pyramid Attention Network (RLPANet) for infrared small target detection, based on the Dense Nested Attention Network. The number of parameters and the computational complexity of model inference are reduced by multiplexing and fusing global-local attention and by segmenting the feature maps. Meanwhile, the detection rate (Pd) and intersection over union (IoU) are maintained by optimizing the residual connections and by redesigning the Weighted Dice-BCE (WDB) loss function with reference to the imaging mode of the human eye. During the detection stage, the intrinsic and extrinsic parameter matrices obtained by calibrating the four-aperture camera are used to relate the images of the overlapping fields of view. Once the target center coordinates are obtained in a sub-field of view, the target center positions in the remaining fields of view are calculated by reprojection. Full-pixel detection over a sub-field of view is thus reduced to a 50×50 detection window, which greatly reduces the image size required for target detection. This method solves the problem of slow model inference caused by the excessively large size of four-aperture infrared images. The time efficiency and stability of the lightweight module are verified on multiple infrared small target datasets and on image sequences collected from actual outdoor scenes.
AB - Four-aperture infrared images have an excessively large input size, which calls for a lightweight detection model. To meet this requirement, this paper proposes a Residual Local Pyramid Attention Network (RLPANet) for infrared small target detection, based on the Dense Nested Attention Network. The number of parameters and the computational complexity of model inference are reduced by multiplexing and fusing global-local attention and by segmenting the feature maps. Meanwhile, the detection rate (Pd) and intersection over union (IoU) are maintained by optimizing the residual connections and by redesigning the Weighted Dice-BCE (WDB) loss function with reference to the imaging mode of the human eye. During the detection stage, the intrinsic and extrinsic parameter matrices obtained by calibrating the four-aperture camera are used to relate the images of the overlapping fields of view. Once the target center coordinates are obtained in a sub-field of view, the target center positions in the remaining fields of view are calculated by reprojection. Full-pixel detection over a sub-field of view is thus reduced to a 50×50 detection window, which greatly reduces the image size required for target detection. This method solves the problem of slow model inference caused by the excessively large size of four-aperture infrared images. The time efficiency and stability of the lightweight module are verified on multiple infrared small target datasets and on image sequences collected from actual outdoor scenes.
KW - Attention Network Structure
KW - Four-aperture Bionic Compound Eye
KW - Infrared Small Target Detection
KW - Residual Local Pyramid Module
UR - https://www.scopus.com/pages/publications/105025784173
U2 - 10.1117/12.3076438
DO - 10.1117/12.3076438
M3 - Conference contribution
AN - SCOPUS:105025784173
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - AOPC 2025
A2 - Li, Xue
A2 - Tang, Xin
PB - SPIE
T2 - AOPC 2025: Infrared and Terahertz Technology and Applications
Y2 - 24 June 2025 through 27 June 2025
ER -