TY - GEN
T1 - H2D-Net
T2 - 2025 International Joint Conference on Neural Networks, IJCNN 2025
AU - Zhang, Zekai
AU - Fan, Xiangpan
AU - Zhou, Shichao
AU - Wang, Wenzheng
AU - Cui, Dongshun
AU - Wang, Shuigen
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Following the detection-by-segmentation paradigm, U-Net and its variants have recently achieved competitive performance on infrared small target detection (IRSTD) benchmarks. However, when targets occupy only a few pixels, the U-shaped deep network tends to favor global background patterns over the local appearance of targets in the feature encoding stage, and it indiscriminately amplifies false feature responses in the decoder. Such representation bias and error accumulation degrade identification capability when target-similar distractors occur. Here, by introducing high-resolution cues, we propose the High-resolution Guided Hierarchical Discriminative Network (H2D-Net), in which a High Resolution Guidance (HRG) module and a Holistic Distractor Filter (HDF) module are devised to address these issues. Specifically, an extra hierarchical network with fixed-scale embedding, i.e., high-resolution cues, runs in parallel to rectify the representation bias of the U-shaped network via a group of HRG modules, which facilitate bidirectional interaction between fine-grained spatial details and multiscale representations. Furthermore, the refining HDF module is embedded into the bottleneck between the encoder and decoder to interrupt the feedforward propagation of false feature responses. Extensive experiments demonstrate that H2D-Net significantly enhances the detection performance of infrared small targets, particularly in reducing false alarms, outperforming state-of-the-art methods across multiple real-world infrared datasets.
KW - attention-induced feature fusion
KW - dense connection
KW - infrared small target detection
KW - multi-scale representation
UR - https://www.scopus.com/pages/publications/105023984733
U2 - 10.1109/IJCNN64981.2025.11228271
DO - 10.1109/IJCNN64981.2025.11228271
M3 - Conference contribution
AN - SCOPUS:105023984733
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - International Joint Conference on Neural Networks, IJCNN 2025 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 30 June 2025 through 5 July 2025
ER -