TY - GEN
T1 - A Reconfigurable Pipelined Architecture for Convolutional Neural Network Acceleration
AU - Xue, Chengbo
AU - Cao, Shan
AU - Jiang, Rongkun
AU - Yang, Hao
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/4/26
Y1 - 2018/4/26
N2 - The convolutional neural network (CNN) has become widely used in a variety of vision recognition applications, and the hardware acceleration of CNN is in urgent need as increasingly more computations are required in the state-of-the-art CNN networks. In this paper, we propose a pipelined architecture for CNN acceleration. The probability of both inner-layer and inter-layer pipeline for typical CNN networks is analyzed. And two types of data re-ordering methods, the filter-first (FF) flow and the image-first (IF) flow, are proposed for different kinds of layers. Then, a pipelined CNN accelerator for AlexNet is implemented, the dataflow of which can be reconfigurably selected for different layer processing. Simulation results show that the proposed pipelined architecture achieves 43% performance improvement compared with the non-pipelined ones. The AlexNet accelerator is implemented in 65nm CMOS technology working at 200MHz, with 350mW power consumption and 24GFLOPS peak performance.
AB - The convolutional neural network (CNN) has become widely used in a variety of vision recognition applications, and the hardware acceleration of CNN is in urgent need as increasingly more computations are required in the state-of-the-art CNN networks. In this paper, we propose a pipelined architecture for CNN acceleration. The probability of both inner-layer and inter-layer pipeline for typical CNN networks is analyzed. And two types of data re-ordering methods, the filter-first (FF) flow and the image-first (IF) flow, are proposed for different kinds of layers. Then, a pipelined CNN accelerator for AlexNet is implemented, the dataflow of which can be reconfigurably selected for different layer processing. Simulation results show that the proposed pipelined architecture achieves 43% performance improvement compared with the non-pipelined ones. The AlexNet accelerator is implemented in 65nm CMOS technology working at 200MHz, with 350mW power consumption and 24GFLOPS peak performance.
KW - Convolutional neural network
KW - hardware accelerator
KW - inter-layer pipeline
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85057086424&partnerID=8YFLogxK
U2 - 10.1109/ISCAS.2018.8351425
DO - 10.1109/ISCAS.2018.8351425
M3 - Conference contribution
AN - SCOPUS:85057086424
T3 - Proceedings - IEEE International Symposium on Circuits and Systems
BT - 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018
Y2 - 27 May 2018 through 30 May 2018
ER -