TY - GEN
T1 - One-Stage Object Detector Using Feature Fusion and Dual Attention
AU - Zhang, Di
AU - Zhang, Weimin
AU - Li, Fangxing
AU - Liang, Kaiwen
AU - Yang, Yuhang
AU - Zhong, Zhide
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Multi-scale object detection and small object detection are important tasks in computer vision and robotics. Traditional one-stage object detection algorithms are fast, but they do not adapt well to changes in object scale and have low accuracy on small objects. In this paper, we propose a one-stage object detector that uses feature fusion and attention mechanisms. The detector offers better adaptability and higher accuracy at marginal extra cost. Our algorithm is based on the CenterNet framework, with modifications and enhancements. First, we use a feature pyramid network with bidirectional cross-scale connections to fully integrate lower-layer and topmost features, which improves the network's adaptability to scale changes. Second, we propose a dual attention module that integrates spatial attention and channel attention, and we add it to every layer of the pyramid so that the network can focus more on useful information, which enhances its capability to detect small objects. Using a lightweight backbone, our algorithm achieves 40.2 mAP on COCO at 33 fps on a Titan Xp GPU, up to 3 mAP higher than CenterNet with the same backbone. In addition, for the small targets in COCO, our algorithm achieves 20.63 mAP, which is 30% higher than CenterNet. We also tested our algorithm on a body part detection dataset from Pascal VOC, which further demonstrates its effectiveness in practical applications.
AB - Multi-scale object detection and small object detection are important tasks in computer vision and robotics. Traditional one-stage object detection algorithms are fast, but they do not adapt well to changes in object scale and have low accuracy on small objects. In this paper, we propose a one-stage object detector that uses feature fusion and attention mechanisms. The detector offers better adaptability and higher accuracy at marginal extra cost. Our algorithm is based on the CenterNet framework, with modifications and enhancements. First, we use a feature pyramid network with bidirectional cross-scale connections to fully integrate lower-layer and topmost features, which improves the network's adaptability to scale changes. Second, we propose a dual attention module that integrates spatial attention and channel attention, and we add it to every layer of the pyramid so that the network can focus more on useful information, which enhances its capability to detect small objects. Using a lightweight backbone, our algorithm achieves 40.2 mAP on COCO at 33 fps on a Titan Xp GPU, up to 3 mAP higher than CenterNet with the same backbone. In addition, for the small targets in COCO, our algorithm achieves 20.63 mAP, which is 30% higher than CenterNet. We also tested our algorithm on a body part detection dataset from Pascal VOC, which further demonstrates its effectiveness in practical applications.
UR - http://www.scopus.com/inward/record.url?scp=85171579890&partnerID=8YFLogxK
U2 - 10.1109/ICARM58088.2023.10218821
DO - 10.1109/ICARM58088.2023.10218821
M3 - Conference contribution
AN - SCOPUS:85171579890
T3 - 2023 8th IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2023
SP - 373
EP - 379
BT - 2023 8th IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2023
Y2 - 8 July 2023 through 10 July 2023
ER -