Abstract

In recent years, deep neural network (DNN) accelerator designs have received much attention due to the dramatically increasing demands of real-time applications. Because computing requirements differ across applications, the convolution kernel sizes in a target DNN model are not fixed, which increases the difficulty of DNN accelerator design. In practice, the processing element (PE) array of a DNN accelerator is usually dimensioned for the largest kernel size in the target DNN model. However, this worst-case design leads to low hardware utilization. In addition, the dedicated array-based PE interconnection restricts the efficiency gained from data-reuse computing methods and increases memory access, because it is difficult to design a single dataflow that supports multiple data-reuse methods in a PE array-based processor. To address these problems, we propose a weight-wise convolution processing mechanism and employ a Network-on-Chip (NoC) interconnection in this work. The proposed mechanism supports arbitrary kernel sizes, and the NoC-based DNN design provides an elastic dataflow that exploits data reuse for any kernel size in the target DNN model. Compared with related works, the experimental results show that the proposed approach improves the average utilization of the computational capability of a PE by 2% to 21% and reduces memory access by 25% to 60%, which further saves 20% to 56% of energy consumption.
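
The abstract does not spell out the weight-wise mechanism in detail; the sketch below only illustrates one common interpretation, as an assumption: each kernel weight is applied as an independent multiply-accumulate pass over a shifted view of the input feature map, so the same per-weight operation covers any kernel size. The function name `weight_wise_conv` and the NumPy setting are hypothetical and used purely for illustration, not taken from the paper.

```python
import numpy as np

def weight_wise_conv(feature_map, kernel):
    """Hypothetical weight-wise convolution sketch (valid padding, stride 1).

    A KxK kernel is decomposed into K*K per-weight multiply-accumulate
    passes over shifted input windows, so the per-weight operation is
    identical regardless of the kernel size K.
    """
    H, W = feature_map.shape
    K = kernel.shape[0]
    out = np.zeros((H - K + 1, W - K + 1))
    # One pass per kernel weight; each pass reuses the whole input feature map.
    for i in range(K):
        for j in range(K):
            out += kernel[i, j] * feature_map[i:i + H - K + 1, j:j + W - K + 1]
    return out

# Sanity check against a direct sliding-window convolution (cross-correlation).
x = np.random.rand(8, 8)
k = np.random.rand(3, 3)
ref = np.array([[np.sum(x[r:r + 3, c:c + 3] * k) for c in range(6)] for r in range(6)])
assert np.allclose(weight_wise_conv(x, k), ref)
```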
