Abstract

Conventional Deep Neural Network (DNN) accelerators are usually designed around array-based processing-element (PE) computation. With array-based computation, memory accesses can be reduced efficiently under a specific dataflow. However, the computing flexibility of contemporary DNN accelerators is usually restricted because they support only a single computing dataflow, so computing efficiency degrades under different hyperparameters of the target DNN (e.g., kernel size, number of layers). Owing to the high flexibility and scalability of Network-on-Chip (NoC) interconnection, NoC-based DNN design has become an attractive paradigm that reduces the design complexity of DNN accelerator implementation. However, current NoC-based DNN designs usually assume that the entire DNN model can be mapped onto the target NoC, so the area overhead grows with the scale of the DNN. To solve this problem, we propose a dynamic mapping algorithm, called dense mapping, which maps neuron operations onto the NoC as long as the available computing resources are sufficient. In addition, an input sharing mechanism is proposed to reuse input data. In this way, we can not only process a DNN model on a small-scale NoC but also decrease the number of memory accesses through the proposed input sharing mechanism. Compared with the related work, the proposed approaches reduce the number of memory accesses by 60.5% and, owing to the fewer memory accesses, improve throughput by 96.5%.
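To illustrate the idea behind dense mapping and input sharing, the following is a minimal Python sketch. The round-based scheduler, the PE-pool abstraction, the function names (`dense_map`, `count_input_fetches`), and the toy layer sizes are illustrative assumptions of ours, not the algorithm as implemented in the paper.

```python
from collections import deque

def dense_map(layers, num_pes):
    """Schedule neuron operations onto a fixed pool of PEs, round by
    round, dispatching as many operations per round as free PEs allow.
    (Hypothetical sketch; not the paper's implementation.)"""
    schedule = []
    for layer_id, num_neurons in enumerate(layers):
        pending = deque(range(num_neurons))
        while pending:
            # Fill the NoC: one neuron operation per available PE.
            batch = []
            while pending and len(batch) < num_pes:
                batch.append((layer_id, pending.popleft()))
            schedule.append(batch)
    return schedule

def count_input_fetches(schedule, share_inputs):
    """Count memory fetches of layer inputs. Without sharing, every
    neuron operation fetches its own copy of the layer input; with
    sharing, the input is fetched once per round and broadcast, since
    all operations in a round belong to the same layer."""
    if share_inputs:
        return len(schedule)
    return sum(len(batch) for batch in schedule)

# Toy MLP with 256, 128, and 10 neurons per layer on a 16-PE NoC.
sched = dense_map([256, 128, 10], num_pes=16)
print(count_input_fetches(sched, share_inputs=False))  # 394 fetches
print(count_input_fetches(sched, share_inputs=True))   # 25 fetches
```

The sketch only counts input fetches; it is meant to show why broadcasting one shared input per round, rather than fetching one copy per neuron operation, cuts memory traffic, which is the effect the abstract attributes to the input sharing mechanism.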
