Abstract
Deep neural networks (DNNs) have become ubiquitous techniques in mobile and embedded systems for applications such as image/object recognition and classification. The trend of executing multiple DNNs simultaneously exacerbates the existing limitations of meeting stringent latency/accuracy requirements on resource-constrained mobile devices. Prior art sheds light on exploring the accuracy-resource tradeoff by scaling model sizes in accordance with resource dynamics. However, such model scaling approaches face two imminent challenges: (i) a large exploration space of model sizes, and (ii) prohibitively long training times for different model combinations. In this paper, we present LegoDNN, a lightweight, block-grained scaling solution for running multi-DNN workloads in mobile vision systems. LegoDNN guarantees short model training times by extracting and training only a small number of common blocks (e.g., 5 in VGG and 8 in ResNet) in a DNN. At runtime, LegoDNN optimally combines the descendant models of these blocks to maximize accuracy under specific resource and latency constraints, while reducing switching overhead via smart block-level scaling of the DNN. We implement LegoDNN in TensorFlow Lite and extensively evaluate it against state-of-the-art techniques (FLOP scaling, knowledge distillation, and model compression) using a set of 12 popular DNN models. Evaluation results show that LegoDNN provides 1,296x to 279,936x more options in model sizes without increasing training time, thus achieving as much as 31.74% improvement in inference accuracy and 71.07% reduction in scaling energy consumption.
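To make the runtime step of the abstract concrete, below is a minimal sketch of what "combining descendant block variants to maximize accuracy under resource and latency constraints" could look like. It is not the paper's actual optimizer: the `BlockVariant` class, the `select_variants` brute-force search, and all profiling numbers are illustrative assumptions, and LegoDNN's real system solves this selection with its own optimization strategy over profiled block descendants.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class BlockVariant:
    # Hypothetical per-variant profile; a real system would obtain these
    # numbers from offline profiling of each descendant block.
    name: str
    accuracy_drop: float  # estimated accuracy loss vs. the original block
    latency_ms: float     # measured inference latency of this variant
    memory_mb: float      # memory footprint of this variant

def select_variants(blocks, latency_budget_ms, memory_budget_mb):
    """Pick one variant per block so that the total estimated accuracy drop
    is minimized while total latency and memory stay within budget.
    A brute-force stand-in for a block-grained runtime optimizer."""
    best_combo, best_drop = None, float("inf")
    for combo in product(*blocks):
        latency = sum(v.latency_ms for v in combo)
        memory = sum(v.memory_mb for v in combo)
        drop = sum(v.accuracy_drop for v in combo)
        if latency <= latency_budget_ms and memory <= memory_budget_mb and drop < best_drop:
            best_combo, best_drop = combo, drop
    return best_combo

# Toy example: two blocks, each with an original and a compressed variant.
blocks = [
    [BlockVariant("b0-original", 0.0, 12.0, 30.0),
     BlockVariant("b0-small", 0.8, 6.0, 12.0)],
    [BlockVariant("b1-original", 0.0, 20.0, 45.0),
     BlockVariant("b1-small", 1.5, 9.0, 18.0)],
]
print(select_variants(blocks, latency_budget_ms=25.0, memory_budget_mb=60.0))
```

With these toy numbers the search keeps the original first block and swaps in the compressed second block, which illustrates the intended behavior: degrade only the blocks needed to fit the current budget rather than scaling the whole model.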
| Original language | English |
| --- | --- |
| Pages | 406-419 |
| Number of pages | 14 |
| DOI | |
| Publication status | Published - 2021 |
| Event | 27th ACM Annual International Conference on Mobile Computing and Networking, MobiCom 2021 - New Orleans, United States; Duration: 25 Oct 2021 → 29 Oct 2021 |
Conference
| Conference | 27th ACM Annual International Conference on Mobile Computing and Networking, MobiCom 2021 |
| --- | --- |
| Country/Territory | United States |
| City | New Orleans |
| Period | 25/10/21 → 29/10/21 |