TY - JOUR
T1 - High-Throughput and Energy-Efficient FPGA-Based Accelerator for All Adder Neural Networks
AU - Zhang, Ning
AU - Ni, Shuo
AU - Chen, Liang
AU - Wang, Tong
AU - Chen, He
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2025
Y1 - 2025
AB - Neural networks have been extensively applied across various Internet of Things (IoT) applications, such as drone- and satellite-based remote sensing and autonomous driving. With the increasing resolution and amount of data captured by sensors, the demand for real-time response in IoT applications is growing markedly. However, it is difficult for existing convolutional neural network (CNN) accelerators for IoT applications on field-programmable gate array (FPGA) platforms to achieve high throughput because of the inherent dense multiplication operations of CNNs, memory bandwidth limitations, and inefficient mapping mechanisms. In this article, a high-throughput and energy-efficient all adder neural network (A2NN) accelerator for IoT applications on an FPGA platform is proposed to solve this problem. First, a series of hardware-oriented algorithm optimization methods is proposed to simplify the processing flow of A2NN and further minimize its deployment overhead. Second, a novel hardware architecture based on the idea of near-memory computation (NMC) is proposed to eliminate off-chip memory access completely and accelerate the reconstructed A2NN in the pipeline. Third, a set of quantitative analysis methods for the proposed accelerator is presented to balance throughput and energy consumption, allowing the accelerator to adapt to the varying demands of different IoT application scenarios. Extensive experimental results on the AMD-Xilinx VC709 board demonstrate that the proposed accelerator achieves state-of-the-art performance in terms of throughput, energy efficiency, and throughput efficiency. Moreover, experiments on the AMD-Xilinx KV260 board highlight the architecture’s exceptional scalability and energy efficiency, enabling a balance between speed and power consumption tailored to the specific requirements of IoT application scenarios.
KW - All adder neural network
KW - accelerator
KW - energy efficient
KW - field-programmable gate array (FPGA)
KW - high throughput
KW - low power
UR - http://www.scopus.com/inward/record.url?scp=85218769929&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2025.3543213
DO - 10.1109/JIOT.2025.3543213
M3 - Article
AN - SCOPUS:85218769929
SN - 2327-4662
VL - 12
SP - 20357
EP - 20376
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 12
ER -