TY - GEN
T1 - Enhancing Multi-Class Object Recognition with MultiCA-YOLOv7
T2 - 2023 IEEE 6th International Conference on Information Systems and Computer Aided Education, ICISCAE 2023
AU - Wang, Nan
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Computer vision and object detection techniques have achieved significant success across various domains. However, challenges posed by multi-class and complex multi-object scenarios often remain overlooked in model predictions. The diversity of target categories, coupled with factors like lighting variations, varying angles, and mutual occlusions, presents challenges for achieving high-level recognition accuracy in existing methods. Addressing these challenges, this paper introduces MultiCA-YOLOv7, based on a multi-point embedded Coordinate Attention (CA) mechanism. The powerful feature extraction capability of the network backbone and the dual-channel and spatial attention enhancement of the CA module result in more accurate features. In this framework, MultiCA-YOLOv7 achieves an impressive 96.15% mAP50 on the test set without introducing additional parameters, outperforming the best-performing baseline model by 6.85%. Across six different categories in the test dataset, all models undergo rigorous evaluation. The experimental results indicate that the proposed model outperforms baseline models in most evaluation metrics. We analyzed the reasons for the ineffective performance of the CA attention mechanism based on YOLOv8. Furthermore, in the prediction of multi-level object detection, the comprehensively trained model successfully captures all moving targets without omission, validating the proposed model's higher accuracy and robustness.
KW - Multi-object classification
KW - MultiCA-YOLOv7
KW - Object detection
UR - http://www.scopus.com/inward/record.url?scp=85187254424&partnerID=8YFLogxK
U2 - 10.1109/ICISCAE59047.2023.10393697
DO - 10.1109/ICISCAE59047.2023.10393697
M3 - Conference contribution
AN - SCOPUS:85187254424
T3 - 2023 IEEE 6th International Conference on Information Systems and Computer Aided Education, ICISCAE 2023
SP - 608
EP - 612
BT - 2023 IEEE 6th International Conference on Information Systems and Computer Aided Education, ICISCAE 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 September 2023 through 25 September 2023
ER -