Abstract
Deep neural networks (DNNs) have become ubiquitous techniques in mobile and embedded systems for applications such as image/object recognition and classification. The trend of executing multiple DNNs simultaneously exacerbates the existing limitations of meeting stringent latency/accuracy requirements on resource-constrained mobile devices. Prior art sheds light on exploring the accuracy-resource tradeoff by scaling model sizes in accordance with resource dynamics. However, such model scaling approaches face two imminent challenges: (i) a large exploration space of model sizes, and (ii) prohibitively long training times for different model combinations. In this paper, we present LegoDNN, a lightweight, block-grained scaling solution for running multi-DNN workloads in mobile vision systems. LegoDNN guarantees short model training times by extracting and training only a small number of common blocks (e.g., 5 in VGG and 8 in ResNet) in a DNN. At runtime, LegoDNN optimally combines the descendant models of these blocks to maximize accuracy under specific resource and latency constraints, while reducing switching overhead via smart block-level scaling of the DNN. We implement LegoDNN in TensorFlow Lite and extensively evaluate it against state-of-the-art techniques (FLOP scaling, knowledge distillation, and model compression) using a set of 12 popular DNN models. Evaluation results show that LegoDNN provides 1,296x to 279,936x more options in model sizes without increasing training time, thus achieving as much as 31.74% improvement in inference accuracy and 71.07% reduction in scaling energy consumption.
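To make the runtime step of the abstract concrete, below is a minimal sketch of what "combining descendant block variants to maximize accuracy under resource and latency constraints" could look like. It is not the paper's actual optimizer: the `BlockVariant` class, the `select_variants` brute-force search, and all profiling numbers are illustrative assumptions, and LegoDNN's real system solves this selection with its own optimization strategy over profiled block descendants.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class BlockVariant:
    # Hypothetical per-variant profile; a real system would obtain these
    # numbers from offline profiling of each descendant block.
    name: str
    accuracy_drop: float  # estimated accuracy loss vs. the original block
    latency_ms: float     # measured inference latency of this variant
    memory_mb: float      # memory footprint of this variant

def select_variants(blocks, latency_budget_ms, memory_budget_mb):
    """Pick one variant per block so that the total estimated accuracy drop
    is minimized while total latency and memory stay within budget.
    A brute-force stand-in for a block-grained runtime optimizer."""
    best_combo, best_drop = None, float("inf")
    for combo in product(*blocks):
        latency = sum(v.latency_ms for v in combo)
        memory = sum(v.memory_mb for v in combo)
        drop = sum(v.accuracy_drop for v in combo)
        if latency <= latency_budget_ms and memory <= memory_budget_mb and drop < best_drop:
            best_combo, best_drop = combo, drop
    return best_combo

# Toy example: two blocks, each with an original and a compressed variant.
blocks = [
    [BlockVariant("b0-original", 0.0, 12.0, 30.0),
     BlockVariant("b0-small", 0.8, 6.0, 12.0)],
    [BlockVariant("b1-original", 0.0, 20.0, 45.0),
     BlockVariant("b1-small", 1.5, 9.0, 18.0)],
]
print(select_variants(blocks, latency_budget_ms=25.0, memory_budget_mb=60.0))
```

With these toy numbers the search keeps the original first block and swaps in the compressed second block, which illustrates the intended behavior: degrade only the blocks needed to fit the current budget rather than scaling the whole model.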
| Original language | English |
| --- | --- |
| Pages | 406-419 |
| Number of pages | 14 |
| DOI | |
| Publication status | Published - 2021 |
| Event | 27th ACM Annual International Conference on Mobile Computing and Networking, MobiCom 2021 - New Orleans, United States; Duration: 25 Oct 2021 → 29 Oct 2021 |
Conference
| Conference | 27th ACM Annual International Conference on Mobile Computing and Networking, MobiCom 2021 |
| --- | --- |
| Country/Territory | United States |
| City | New Orleans |
| Period | 25/10/21 → 29/10/21 |