RaNAS: Resource-Aware Neural Architecture Search for Edge Computing

  • Jianhua Gao
  • Zeming Liu
  • Yizhuo Wang
  • Weixing Ji*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Neural architecture search (NAS) for edge devices is often time-consuming because deployment and on-device testing incur long latency. Accurately predicting the computation cost and memory requirement of convolutional neural networks (CNNs) in advance is therefore highly valuable. Existing work primarily relies on analytical models, which can result in high prediction errors. This article proposes a resource-aware NAS (RaNAS) model based on various features. Additionally, a new graph neural network is introduced to predict inference latency and maximum memory requirements for CNNs on edge devices. Experimental results show that, within an error bound of ±1%, RaNAS achieves an accuracy improvement of approximately 8% for inference latency prediction and about 25% for maximum memory occupancy prediction over state-of-the-art approaches.
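The abstract reports accuracy "within the error bound of ±1%" but does not define the metric. One common reading, assumed here and not confirmed by the source, is the fraction of predictions whose relative error against the measured value is at most 1%. A minimal sketch under that assumption follows; the function name `within_bound_accuracy` and the sample latencies are hypothetical and are not taken from the article.

```python
def within_bound_accuracy(predicted, measured, bound=0.01):
    """Fraction of predictions whose relative error is within +/- bound.

    Assumed definition (not stated in the abstract): a prediction counts
    as correct when |predicted - measured| <= bound * |measured|.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not measured:
        raise ValueError("at least one sample is required")
    hits = sum(
        1
        for p, m in zip(predicted, measured)
        if abs(p - m) <= bound * abs(m)
    )
    return hits / len(measured)


# Hypothetical inference latencies in milliseconds (illustrative only).
measured = [10.0, 20.0, 40.0, 80.0]
predicted = [10.05, 20.5, 39.8, 80.0]

# The second prediction is off by 2.5%, so it falls outside the 1% bound.
acc = within_bound_accuracy(predicted, measured)
print(acc)
```

Under this reading, the same function would apply unchanged to peak-memory predictions, since the bound is relative to each measured value.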

Original language: English
Article number: 18
Journal: Transactions on Architecture and Code Optimization
Volume: 22
Issue number: 1
Publication status: Published - 20 Mar 2025
Externally published: Yes

Keywords

  • Neural architecture search
  • computational resource
  • edge devices
  • graph neural network
