Abstract

Designing resource-efficient deep neural networks (DNNs) is a challenging task due to the enormous diversity of applications and their time-consuming design, training, optimization, and evaluation cycles, especially for resource-constrained embedded systems. To address these challenges, we propose a novel DNN design framework called accuracy-and-performance-aware neural architecture search (APNAS), which generates DNNs efficiently because it does not require hardware devices or simulators while searching for optimized DNN model configurations that offer both high inference accuracy and high execution performance. In addition, to accelerate DNN generation, APNAS builds on a weight-sharing and reinforcement-learning-based exploration methodology with a recurrent neural network controller at its core that generates sample DNN configurations. The reinforcement-learning reward is formulated as a configurable function of each sample DNN's accuracy and the cycle count required to run it on a target hardware architecture. To further expedite the DNN generation process, we devise analytical models for cycle count estimation instead of running millions of DNN configurations on real hardware. We demonstrate that these analytical models are highly accurate and provide cycle count estimates identical to those of a cycle-accurate hardware simulator. Experiments that quantitatively vary the hardware constraints demonstrate that APNAS requires only 0.55 graphics processing unit (GPU) days on a single Nvidia GTX 1080Ti GPU to generate DNNs that, on average, require 53% fewer cycles with negligible accuracy degradation (3% on average) for image classification compared to state-of-the-art techniques.
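
The reinforcement-learning reward is described above as a configurable function of a sample DNN's accuracy and its cycle count on the target hardware. The snippet below is a minimal sketch of one such configurable reward, assuming a simple multiplicative form; the function name, the cycle-budget normalization, and the exponent w are illustrative assumptions rather than the exact formulation used in APNAS.

def apnas_reward(accuracy, cycle_count, cycle_budget, w=0.5):
    """Illustrative reward for a sampled DNN configuration (assumed form).

    accuracy     -- validation accuracy of the sampled DNN, in [0, 1]
    cycle_count  -- estimated cycles to run the DNN on the target accelerator
    cycle_budget -- reference cycle count used to normalize performance
    w            -- configurable weight trading accuracy against performance
    """
    # Reward grows with accuracy and shrinks as the cycle count grows
    # beyond the budget; w controls how strongly performance is weighted.
    return accuracy * (cycle_budget / cycle_count) ** w

Under this assumed form, increasing w pushes the controller toward faster (lower-cycle) configurations, whereas w = 0 reduces the search to an accuracy-only NAS.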

Highlights

  • The accuracy offered by state-of-the-art deep neural networks (DNNs) has led to their use in a variety of artificial intelligence applications, including object detection, speech recognition, event detection, machine translation, and autonomous driving [1]–[10]

  • We demonstrated that accuracy-and-performance-aware NAS (APNAS) successfully generated convolutional neural networks (CNNs) that account for both validation accuracy and computational complexity

  • Our work focuses on a weight-stationary dataflow-based neural processing array (NPA), such as the massively-parallel neural array (MPNA) accelerator [27], which is composed of a smaller-scale NPA suitable for embedded systems

Summary

INTRODUCTION

The accuracy offered by state-of-the-art deep neural networks (DNNs) has led to their use in a variety of artificial intelligence applications, including object detection, speech recognition, event detection, machine translation, and autonomous driving [1]–[10]. Although several recent studies on NAS have considered both accuracy and computation amount (or execution time) [21]–[23], they required time-consuming evaluation on a hardware device or simulator, resulting in a very long exploration time that can significantly delay design and deployment cycles. We propose a novel NAS framework called accuracy-and-performance-aware neural architecture search (APNAS), which considers both accuracy and performance to automatically generate DNNs suitable for neural hardware accelerators in resource-constrained embedded systems, without the need for hardware devices or time-consuming simulators. We propose an analytical model that provides an abstract yet accurate estimation of the performance (i.e., cycle count) required for the inference of a CNN on neural processing array-based hardware.
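
To illustrate what such an analytical model can look like, the sketch below gives a first-order cycle estimate for a single convolutional layer on a weight-stationary processing-element (PE) array by counting how many array-sized channel tiles are needed. The function name, the row/column mapping of input and output channels, and the omission of pipeline fill/drain and memory stalls are simplifying assumptions for illustration; this is not the exact model devised in the paper.

import math

def conv_layer_cycles(out_h, out_w, kernel, in_ch, out_ch, rows, cols):
    """First-order cycle estimate for one convolutional layer on a
    rows x cols weight-stationary PE array (illustrative assumptions only).

    Input channels are assumed to map onto the array rows and output
    channels onto the columns; each channel tile streams every output
    pixel once, and pipeline fill/drain and memory stalls are ignored.
    """
    ch_tiles = math.ceil(in_ch / rows) * math.ceil(out_ch / cols)
    return out_h * out_w * kernel * kernel * ch_tiles

# Example: a 3x3 convolution producing a 56x56 output feature map,
# with 64 input channels and 128 output channels, on an 8x8 PE array.
print(conv_layer_cycles(56, 56, 3, 64, 128, 8, 8))

Summing such per-layer estimates over a sampled network gives the cycle count used in the reward, without running the configuration on hardware or in a simulator.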

PRELIMINARIES
CYCLE COUNT ESTIMATION FOR CNN
Findings
CONCLUSIONS