Abstract

Deep Neural Networks (DNNs) have demonstrated superior performance on classification and recognition problems in recent years. However, hardware implementation of DNNs is challenging due to their high computational complexity and the diverse dataflows of different DNN models. To mitigate this design challenge, a large body of research has focused on accelerating specific DNN models or layers with dedicated designs; such dedicated designs, however, limit design flexibility. In this work, we take advantage of the similarity among different DNN models and propose a novel Lego-based Deep Neural Network on a Chip (DNNoC) design methodology. We abstract common neural computing operations (e.g., multiply-accumulate and pooling) into neuron computing units called NeuLego processing elements (NeuLegoPEs). These NeuLegoPEs are then interconnected through a flexible Network-on-Chip (NoC), allowing different DNN models to be constructed. To support large-scale DNN models, we enhance the reusability of each NeuLegoPE with a proposed Lego placement method. The proposed methodology allows a single design to be reused across different DNN model implementations, reducing implementation cost and time-to-market. Compared with conventional approaches, the proposed approach improves average throughput by 2,802% on the evaluated DNN models. In addition, we implement the corresponding hardware to validate the proposed methodology, showing an average hardware efficiency improvement of 12,523% when throughput and area overhead are considered jointly.
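Although the methodology targets hardware, the behavioral sketch below illustrates the core idea of composing a DNN from reusable processing elements connected through an interconnect. It is a minimal Python sketch under our own assumptions: the class names, the two PE modes, and the feed-forward message passing are illustrative and do not reproduce the authors' NeuLegoPE microarchitecture or NoC protocol.

    # Behavioral sketch of Lego-style PE composition (illustrative only).
    # Names and routing are assumptions, not the paper's implementation.

    class NeuLegoPE:
        """A reusable processing element offering MAC and max-pooling modes."""

        def __init__(self, mode, weights=None):
            self.mode = mode          # "mac" or "pool"
            self.weights = weights or []

        def compute(self, inputs):
            if self.mode == "mac":
                # Multiply-accumulate: dot product of inputs and local weights.
                return sum(x * w for x, w in zip(inputs, self.weights))
            if self.mode == "pool":
                # Max pooling over the incoming activations.
                return max(inputs)
            raise ValueError(f"unknown mode: {self.mode}")

    class NoC:
        """A toy interconnect: routes each PE's output to its consumers."""

        def __init__(self):
            self.pes = {}             # pe_id -> NeuLegoPE
            self.edges = {}           # pe_id -> list of downstream pe_ids

        def place(self, pe_id, pe):
            self.pes[pe_id] = pe

        def connect(self, src, dst):
            self.edges.setdefault(src, []).append(dst)

        def run(self, feeds):
            """feeds: pe_id -> external inputs; returns pe_id -> output."""
            inbox = {pid: list(feeds.get(pid, [])) for pid in self.pes}
            outputs = {}
            # Evaluate in placement order (assumes a feed-forward topology).
            for pid, pe in self.pes.items():
                outputs[pid] = pe.compute(inbox[pid])
                for dst in self.edges.get(pid, []):
                    inbox[dst].append(outputs[pid])
            return outputs

    # Compose a tiny two-MAC-plus-pool "model" from reusable PEs.
    noc = NoC()
    noc.place("mac0", NeuLegoPE("mac", weights=[0.5, -1.0]))
    noc.place("mac1", NeuLegoPE("mac", weights=[2.0, 1.0]))
    noc.place("pool", NeuLegoPE("pool"))
    noc.connect("mac0", "pool")
    noc.connect("mac1", "pool")

    print(noc.run({"mac0": [1.0, 2.0], "mac1": [1.0, 2.0]}))
    # -> {'mac0': -1.5, 'mac1': 4.0, 'pool': 4.0}

Because every node is the same reusable unit, changing only the placement and connection calls reconfigures the fabric for a different model, which is the flexibility the Lego analogy captures.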
