Abstract
Neural network object detection systems deployed on edge computing devices face limited hardware resources and tight power budgets. To address this, a YOLOv3-Tiny object detection hardware acceleration system based on a field-programmable gate array (FPGA) is proposed. The scale of the YOLOv3-Tiny network is reduced through network structure reorganization, inter-layer fusion, and dynamic numerical quantization. Hardware resource utilization efficiency is improved through a channel-parallel, weight-resident hardware acceleration algorithm, a tightly pipelined processing flow, and reuse of hardware operation units. The end-to-end object detection acceleration system was deployed on an UltraScale+ XCZU9EG FPGA. The results show that it achieves a throughput of 96.6 GOPS and a detection frame rate of 17.3 FPS at a power consumption of 4.12 W. The hardware resource utilization efficiency is 0.32 GOPS/DSP and 2.68 GOPS/kLUT. While maintaining efficient and accurate object detection, the system achieves better hardware resource utilization efficiency than other existing YOLOv3-Tiny object detection hardware accelerators.
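The reported figures can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the efficiency ratios are defined as throughput divided by the resources used, and the function name `derived_metrics` and its dictionary keys are hypothetical. The energy efficiency, per-frame workload, and implied DSP and LUT counts are back-calculated from the rounded numbers in the abstract, not values stated by the authors.

```python
# Figures reported in the abstract.
THROUGHPUT_GOPS = 96.6   # giga-operations per second
FRAME_RATE_FPS = 17.3    # detection frames per second
POWER_W = 4.12           # watts
GOPS_PER_DSP = 0.32      # reported DSP efficiency
GOPS_PER_KLUT = 2.68     # reported LUT efficiency (per thousand LUTs)


def derived_metrics(gops, fps, watts, gops_per_dsp, gops_per_klut):
    """Back-calculate quantities implied by the reported figures.

    Assumption: each efficiency ratio is throughput / resources used,
    so the resource count is throughput / ratio. The results are
    estimates limited by the rounding of the inputs.
    """
    return {
        "gops_per_watt": gops / watts,         # energy efficiency
        "gop_per_frame": gops / fps,           # workload per detected frame
        "implied_dsp": gops / gops_per_dsp,    # approx. DSP slices used
        "implied_klut": gops / gops_per_klut,  # approx. thousands of LUTs used
    }


metrics = derived_metrics(
    THROUGHPUT_GOPS, FRAME_RATE_FPS, POWER_W, GOPS_PER_DSP, GOPS_PER_KLUT
)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the figures are mutually consistent: roughly 23.4 GOPS/W, about 5.6 GOP of work per frame, and on the order of 300 DSP slices and 36 kLUTs.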
Translated title of the contribution | Efficient Hardware Acceleration System Design for End-to-End Object Detection Neural Network |
---|---|
Original language | Chinese (Traditional) |
Pages (from-to) | 1312-1320 |
Number of pages | 9 |
Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
Volume | 42 |
Issue number | 12 |
DOIs | |
Publication status | Published - Dec 2022 |