Abstract
Edge Intelligence (EI) offers an attractive approach to local AI processing at the network edge, protecting privacy and reducing data transmission, but deploying resource-intensive neural networks on edge devices remains a challenge. Neural architecture search (NAS), which automates network design with minimal manual intervention, is a pivotal tool for EI. However, existing methods typically optimize resource consumption for a specific hardware platform, yielding hardware-specific neural architectures with limited generalizability. In response, we propose OnceNAS, a novel method that designs and optimizes on-device inference neural networks for resource-constrained edge devices. OnceNAS optimizes parameter count and inference latency jointly with inference accuracy, producing lightweight neural networks that maintain their inference performance. We also introduce an efficient evaluation strategy that assesses multiple metrics simultaneously. Experimental results demonstrate the effectiveness of OnceNAS, which finds high-performing architectures with substantial size reduction (10.49x) and speedup (5.45x). OnceNAS therefore offers practical value by generating efficient on-device inference neural architectures for resource-constrained edge devices, facilitating real-world applications such as autonomous driving and smart healthcare. Furthermore, we contribute DARTS-Bench, an open-source dataset that provides candidate architectures with hardware-related information and a user-friendly API, facilitating future research in lightweight NAS.
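To make the idea of optimizing accuracy, parameter count, and latency together more concrete, here is a minimal sketch of Pareto-front selection over candidate architectures. This is a generic multi-objective technique, not the OnceNAS search procedure, which the abstract does not specify. The `Candidate` type, the architecture names, and all metric values are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate architecture."""
    name: str
    accuracy: float    # higher is better
    params_m: float    # parameters in millions, lower is better
    latency_ms: float  # inference latency in ms, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every metric and strictly better on one."""
    no_worse = (
        a.accuracy >= b.accuracy
        and a.params_m <= b.params_m
        and a.latency_ms <= b.latency_ms
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.params_m < b.params_m
        or a.latency_ms < b.latency_ms
    )
    return no_worse and strictly_better


def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Illustrative values only; these are not results from the paper.
candidates = [
    Candidate("arch_a", 0.97, 3.3, 12.0),
    Candidate("arch_b", 0.96, 0.9, 4.0),
    Candidate("arch_c", 0.95, 1.5, 6.0),  # dominated by arch_b on all three metrics
    Candidate("arch_d", 0.90, 0.4, 2.5),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this sketch, a candidate survives only if no other candidate is at least as good on all three metrics and better on at least one, so the result is a set of trade-offs rather than a single winner.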