Zhang, Yusen; Qin, Yunchuan; Zhang, Yufeng; Zhou, Xu; Jian, Songlei; Tan, Yusong; Li, Kenli
Edge Intelligence (EI) offers an attractive approach for local AI processing at the network edge for privacy protection and reduced transmission, but deploying resource-intensive neural networks on edge devices remains a challenge. The neural architecture search (NAS) technique, known for its automation and minimal manual intervention, serves as a pivotal tool for EI. However, existing methods typically concentrate on optimizing resource consumption for specific hardware, leading to hardware-specific neural architectures with limited generalizability. In response, we propose OnceNAS, a novel method that designs and optimizes on-device inference neural networks for resource-constrained edge devices. OnceNAS simultaneously optimizes for parameter count and inference latency in addition to inference accuracy, producing lightweight neural networks while maintaining their inference performance. Meanwhile, we introduce an efficient evaluation strategy that can simultaneously assess multiple metrics. Experimental results demonstrate the effectiveness of OnceNAS, achieving high-performing architectures with substantial size reduction (10.49x) and speedup (5.45x). As a result, OnceNAS offers practical value by generating efficient on-device inference neural architectures for resource-constrained edge devices, facilitating real-world applications like autonomous driving and smart healthcare. Furthermore, we contribute DARTS-Bench, an open-source dataset providing candidate architectures with hardware-related information and a user-friendly API, facilitating future research in lightweight NAS.
• Carefully Selected Optimization Objectives: We choose hardware-agnostic objectives to control the model complexity.
• Efficient Neural Architecture Search Strategy: We introduce a novel NAS method for resource-constrained edge devices.
• Accurate Architecture Performance Evaluation: We propose an efficient neural architecture performance evaluation method.
• Dataset Contribution: We contribute DARTS-Bench, an open-source dataset of candidate architectures with corresponding runtime information and API functions.
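The multi-objective search described above balances inference accuracy against parameter count and latency. As a minimal sketch (not the authors' actual OnceNAS implementation; the `Candidate` fields and example numbers are hypothetical), candidates can be compared by Pareto dominance, keeping only architectures that no other candidate beats on all three objectives:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    accuracy: float    # higher is better
    params_m: float    # parameters in millions, lower is better
    latency_ms: float  # inference latency, lower is better

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on at least one."""
    no_worse = (a.accuracy >= b.accuracy
                and a.params_m <= b.params_m
                and a.latency_ms <= b.latency_ms)
    strictly_better = (a.accuracy > b.accuracy
                       or a.params_m < b.params_m
                       or a.latency_ms < b.latency_ms)
    return no_worse and strictly_better

def pareto_front(pop: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate dominates."""
    return [c for c in pop
            if not any(dominates(o, c) for o in pop if o is not c)]

# Hypothetical candidates illustrating the size/speed trade-off reported above.
pop = [
    Candidate("baseline", 0.94, 25.0, 40.0),  # accurate but heavy and slow
    Candidate("light",    0.93,  2.4,  7.3),  # ~10x smaller, ~5x faster
    Candidate("mid",      0.90,  3.0,  9.0),  # dominated by "light" on all objectives
]
print([c.name for c in pareto_front(pop)])  # → ['baseline', 'light']
```

Dominance-based selection keeps the whole accuracy/size/latency trade-off surface rather than collapsing it into a single weighted score, which is why hardware-agnostic objectives like parameter count generalize across devices.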