Abstract

Neural network (NN) processors are designed specifically for deep learning workloads built on multilayer artificial NNs. They have proven useful across a broad range of applications, including image recognition, speech processing, machine translation, and scientific computing. Meanwhile, self-aware techniques, whereby a system reacts dynamically to information continuously sensed from its execution environment, have attracted attention from both academia and industry. Various self-aware techniques have been applied to NN systems to significantly improve computational speed and energy efficiency. This article surveys state-of-the-art self-aware NN systems (SaNNSs), in which self-awareness can be achieved at different layers: the architectural layer, the physical layer, and the circuit layer. At the architectural layer, SaNNSs can be characterized from a data-centric perspective in which different data properties (i.e., data value, data precision, dataflow, and data distribution) are exploited. At the physical layer, various parameters of the physical implementation are considered. At the circuit layer, different logics and devices can be used for high efficiency. The self-awareness of existing SaNNSs is, however, still in a preliminary form. We propose a comprehensive SaNNS from a new perspective, the model layer, to exploit more opportunities for high efficiency. The proposed system, called MinMaxNN, features model switching and elastic sparsity, both driven by monitored information from the execution environment. With model switching, the system dynamically switches between models (i.e., a min model and a max model) depending on the input, for both efficiency and accuracy. With elastic sparsity, the sparsity of the NN is dynamically adjusted in each layer for efficiency. Experimental results show that, compared with traditional SaNNSs, MinMaxNN achieves a 5.64× performance improvement and a 19.66% energy reduction without notable loss of accuracy or negative effects on developers' productivity.
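
The two model-layer mechanisms can be illustrated with a minimal sketch. The abstract does not specify how switching is triggered or how sparsity is adjusted, so everything below is an assumption made for illustration: switching is driven by a softmax-confidence threshold on the min model, sparsity is applied by magnitude pruning, and the per-layer sparsity target rises linearly with a hypothetical monitored `load` signal in [0, 1]. The names `switch_predict`, `prune_by_magnitude`, and `elastic_sparsity` are invented here and are not taken from MinMaxNN.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A model is any callable mapping an input vector to class logits (assumption).
Model = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def switch_predict(x: Sequence[float], min_model: Model, max_model: Model,
                   threshold: float = 0.8) -> Tuple[int, str]:
    """Run the cheap model first; fall back to the large model when its
    top-class probability is below `threshold` (assumed trigger)."""
    probs = softmax(min_model(x))
    if max(probs) >= threshold:
        return probs.index(max(probs)), "min"
    probs = softmax(max_model(x))
    return probs.index(max(probs)), "max"


def prune_by_magnitude(weights: Sequence[float], sparsity: float) -> List[float]:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def elastic_sparsity(base: float, load: float, max_sparsity: float = 0.9) -> float:
    """Interpolate a layer's sparsity target between `base` and `max_sparsity`
    according to a monitored load in [0, 1] (hypothetical signal)."""
    load = min(max(load, 0.0), 1.0)
    return base + (max_sparsity - base) * load


if __name__ == "__main__":
    # Toy stand-ins for a small and a large network.
    tiny = lambda x: [x[0], 0.0]
    big = lambda x: [0.0, 5.0]
    print(switch_predict([5.0], tiny, big))   # confident input stays on the min model
    print(switch_predict([0.1], tiny, big))   # uncertain input escalates to the max model
    target = elastic_sparsity(base=0.5, load=0.0)
    print(prune_by_magnitude([0.5, -0.1, 0.05, -2.0], target))
```

In this sketch the min model handles inputs it classifies confidently, so the cost of the max model is paid only for harder inputs, while the sparsity target lets each layer trade computation for accuracy as the monitored conditions change.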

