Abstract
Neural architecture search (NAS) for edge devices is often time-consuming because of the long latency of deploying and testing candidate networks on the devices themselves. Accurately predicting the computation cost and memory requirement of convolutional neural networks (CNNs) in advance is therefore of substantial value. Existing work relies primarily on analytical models, which can incur high prediction errors. This article proposes a resource-aware NAS (RaNAS) model built on a diverse set of features, together with a new graph neural network that predicts inference latency and maximum memory requirements for CNNs on edge devices. Experimental results show that, within an error bound of ±1%, RaNAS improves prediction accuracy by approximately 8% for inference latency and about 25% for maximum memory occupancy over state-of-the-art approaches.
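To make the idea concrete, the sketch below shows a toy graph-based latency predictor over a CNN layer graph. All weights, feature choices, and function names here are hypothetical illustrations of the general technique; the abstract does not specify RaNAS's actual GNN architecture or features.

```python
# Illustrative one-round message-passing predictor over a CNN layer graph.
# The weights (w_self, w_nbr, bias) and the per-layer FLOPs feature are
# hypothetical; a real model would learn them from measured device data.

def predict_latency(nodes, edges, w_self=1e-9, w_nbr=5e-10, bias=0.1):
    """Estimate latency from a layer graph.

    nodes: dict mapping layer name -> feature value (e.g. FLOPs)
    edges: list of (src, dst) directed layer connections
    Returns a latency estimate in arbitrary units.
    """
    # Aggregation step: each node sums the features of its predecessors.
    agg = {n: 0.0 for n in nodes}
    for src, dst in edges:
        agg[dst] += nodes[src]
    # Update + readout: linear combination of own and aggregated features,
    # sum-pooled over all layers into a single scalar prediction.
    return bias + sum(w_self * f + w_nbr * agg[n] for n, f in nodes.items())

# Example: a three-layer chain conv1 -> conv2 -> fc with per-layer FLOPs.
layers = {"conv1": 2.0e8, "conv2": 4.0e8, "fc": 1.0e7}
chain = [("conv1", "conv2"), ("conv2", "fc")]
print(predict_latency(layers, chain))
```

A learned model would replace the fixed linear weights with trained message and readout functions, and extend the node features beyond FLOPs (e.g. parameter counts, tensor shapes, operator types).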
| Original language | English |
|---|---|
| Article number | 18 |
| Journal | ACM Transactions on Architecture and Code Optimization |
| Volume | 22 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - 20 Mar 2025 |
| Externally published | Yes |
Keywords
- Neural architecture search
- computational resource
- edge devices
- graph neural network