Abstract
Deep Neural Networks (DNNs) have shown superior accuracy at the expense of high memory and computation requirements. Optimizing DNN models regarding energy and hardware resource requirements is extremely important for applications with resource-constrained embedded environments. Although using binary neural networks (BNNs), one of the recent promising approaches, significantly reduces the design’s complexity, accuracy degradation is inevitable when reducing the precision of parameters and output activations. To balance between implementation cost and accuracy, in addition to proposing specialized hardware accelerators for corresponding specific network models, most recent software binary neural networks have been optimized based on generalized metrics, such as FLOPs or MAC operation requirements. However, with the wide range of hardware available today, independently evaluating software network structures is not good enough to determine the final network model for typical devices. In this paper, an architecture search algorithm based on estimating the hardware performance at the design time is proposed to achieve the best binary neural network models for hardware implementation on target platforms. With the XNOR-net used as a base architecture and target platforms, including Field Programmable Gate Array (FPGA), Graphic Processing Unit (GPU), and Resistive Random Access Memory (RRAM), the proposed algorithm shows its efficiency by giving more accurate estimation for the hardware performance at the design time than FLOPs or MAC operations.
Highlights
In recent years, deep learning has demonstrated incredible performance in diverse research areas with different tasks, such as classification and detection [1,2]
In addition to deploying optimal accelerators and optimizing based on accuracy and generalized metrics, such as FLOPs, MAC operations are a popular method applied for most neural network models
Only evaluating on software level is not enough to select the optimal model for hardware implementation
Summary
Deep learning has demonstrated incredible performance in diverse research areas with different tasks, such as classification and detection [1,2]. To design a hardware platform-aware optimal BNN, we propose a new framework that can analyze and explore the ultimate software model based on the estimation of the hardware performance with optimal effort and an architecture search for the training period. We present a neural network search algorithm called Deepbit, which can explore optimal BNN models for target hardware platforms by using the binary search method and hardware cost estimation charts. Since the training is performed on GPU servers that are resourceful environments in terms of computing performance, power, and temperature, it is much favorable to increase the computation at the training time if it results in an efficient neural network for the targeted hardware.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.