Dataflow optimization with layer-wise design variables estimation method for enflame CNN accelerators

Tian Chen, Yu-an Tan, Zheng Zhang, Nan Luo, Bin Li, Yuanzhang Li*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

As convolution layers have proven to be the most time-consuming operations in convolutional neural network (CNN) algorithms, many efficient CNN accelerators have been designed to boost the performance of convolution operations. Previous works on CNN acceleration usually use fixed design variables across diverse convolutional layers, which leads to inefficient data movement and low utilization of computing resources. We tackle this issue by proposing a flexible dataflow optimization method that estimates design variables for each layer. The optimization method first narrows the design space using a priori constraints, and then enumerates all legal solutions to select the optimal design variables. We demonstrate the effectiveness of the proposed optimization method by implementing representative CNN models (VGG-16, ResNet-18 and MobileNet V1) on Enflame Technology's programmable CNN accelerator, the General Computing Unit (GCU). The results indicate that our optimization significantly enhances the throughput of the convolution layers in ResNet, VGG and MobileNet on GCU, with improvements of up to 1.84×. Furthermore, it achieves up to 2.08× higher GCU utilization for the convolution layers of ResNet.
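The two-stage selection the abstract describes (pruning the design space with a priori constraints, then enumerating the legal remainder to pick the best design variables) can be sketched as follows. This is a hypothetical illustration only: the buffer size, the tile-footprint model, and the utilization metric are assumptions for the sketch, not Enflame's actual GCU cost model.

```python
# Hypothetical sketch of layer-wise design-variable selection:
# stage 1 prunes tilings that violate an a priori constraint (here, an
# on-chip buffer capacity), stage 2 enumerates the legal remainder and
# keeps the tiling with the best estimated utilization. The constants
# and the cost model are illustrative, not the paper's actual GCU model.

BUFFER_BYTES = 64 * 1024  # assumed on-chip buffer capacity

def tile_footprint(th, tw, tc, bytes_per_elem=2):
    """Rough on-chip bytes for one tile (input + output copies, fp16)."""
    return 2 * th * tw * tc * bytes_per_elem

def utilization(th, tw, tc, H, W, C):
    """Fraction of the layer covered by full tiles (ragged edges waste compute)."""
    covered = (H // th) * th * (W // tw) * tw * (C // tc) * tc
    return covered / (H * W * C)

def best_tiling(H, W, C):
    # Stage 1: a priori constraint narrows the design space.
    legal = [(th, tw, tc)
             for th in range(1, H + 1)
             for tw in range(1, W + 1)
             for tc in range(1, C + 1)
             if tile_footprint(th, tw, tc) <= BUFFER_BYTES]
    # Stage 2: enumerate all legal solutions; prefer high utilization,
    # breaking ties toward larger tiles (fewer off-chip transfers).
    return max(legal, key=lambda t: (utilization(*t, H, W, C),
                                     tile_footprint(*t)))
```

Invoked once per layer, e.g. `best_tiling(28, 28, 32)` for a small feature map, it returns the tile height, width, and channel count used to configure that layer's dataflow, which is what makes the variables layer-wise rather than fixed.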

Original language: English
Article number: 104869
Journal: Journal of Parallel and Distributed Computing
Volume: 189
DOIs
Publication status: Published - Jul 2024

Keywords

  • Convolutional neural networks (CNNs)
  • General computing unit (GCU)
  • Optimization
  • Programmable dataflow

