Abstract

Deep learning has become a key enabler of artificial intelligence applications and has been used successfully to solve computer vision tasks. However, deep learning algorithms rely on Deep Neural Networks (DNNs) with many hidden layers, which demand substantial computation and storage. General-purpose graphics processing units (GPGPUs) have therefore been the preferred platform for DNN development and inference, thanks to their large number of processing cores and large integrated memory. Their main drawback is high power consumption. In real-world applications, however, the processing unit is often an embedded system with limited power and computation resources. In recent years, the Field Programmable Gate Array (FPGA) has emerged as a serious alternative that can outperform GPGPUs owing to its flexible architecture and low power consumption. FPGAs, however, offer only a small amount of on-chip memory and limited bandwidth. Fitting DNNs onto an FPGA therefore requires optimization techniques at several levels: the network level, the hardware level, and the implementation tools level. In this paper, we survey and evaluate the existing optimization techniques to provide a complete overview of FPGA-based DNN accelerators.
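As an illustration of the network-level optimizations mentioned above, the sketch below shows symmetric 8-bit post-training quantization of a weight tensor, a common technique for shrinking DNN storage and bandwidth requirements on memory-constrained devices such as FPGAs. This is a minimal NumPy example; the function names and the symmetric int8 scheme are illustrative choices, not a method taken from the paper.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric post-training quantization of a weight tensor to int8.

    Returns the quantized tensor and the scale needed to dequantize.
    """
    # Scale chosen so the largest-magnitude weight maps to 127.
    scale = np.max(np.abs(weights)) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Approximate reconstruction of the original float weights.
    return q.astype(np.float32) * scale

# Example: quantize a small random weight matrix and measure the error.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print("max reconstruction error:", np.max(np.abs(w - dequantize(q, s))))
```

Storing weights as int8 instead of float32 cuts their memory footprint by a factor of four, which is one reason quantization is a standard first step when targeting the small on-chip memories described above.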
