Concurrent MAC unit design using VHDL for deep learning networks on FPGA

Hossam O Ahmed, Maged Ghoneima, Mohamed Dessouky

doi:10.1109/iscaie.2018.8405440

Publication Date: Apr 1, 2018

Citations: 13

Affiliation: Ain Shams University

Abstract

Deep neural network algorithms have proven their capabilities in a wide range of artificial intelligence applications, especially in printed and handwritten text recognition, multimedia processing, robotics, and many other high-end technological trends. The most challenging aspect today is meeting the extreme computational demands of such algorithms, especially in real-time systems. Recently, the Field Programmable Gate Array (FPGA) has been considered one of the best hardware platforms for accelerating deep neural network architectures because of its adaptability and the high degree of parallelism it offers. This paper proposes an 8-bit fixed-point parallel multiply-accumulate (MAC) unit architecture that provides a fully customized MAC unit for Convolutional Neural Networks (CNNs) instead of depending on the conventional DSP blocks and embedded memory units in the FPGA silicon fabric. The proposed MAC unit is designed in VHDL and achieves a computational speed of up to 4.17 giga operations per second (GOPS) on high-density FPGAs.
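As a rough illustration of what a parallel multiply-accumulate operation on 8-bit operands computes, the following is a minimal behavioral sketch in Python. It is not the authors' VHDL design: the function names `to_int8` and `parallel_mac`, the signed two's-complement interpretation of the operands, and the unbounded accumulator are all assumptions made for illustration, since the abstract does not specify the number format or accumulator width.

```python
def to_int8(x):
    """Wrap an integer into the signed 8-bit range, as an 8-bit register would.

    Assumption: operands are signed two's-complement values.
    """
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def parallel_mac(inputs, weights, acc=0):
    """Behavioral model of a parallel MAC step (hypothetical helper).

    Each input/weight pair is multiplied independently, which is the part a
    hardware design could perform concurrently, and the products are then
    summed into the accumulator. Python integers do not overflow, so the
    accumulator width of a real design is not modeled here.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    products = [to_int8(a) * to_int8(w) for a, w in zip(inputs, weights)]
    return acc + sum(products)


if __name__ == "__main__":
    # A 3-element dot product, the core operation of a convolution window.
    print(parallel_mac([1, 2, 3], [4, 5, 6]))
```

In hardware, each product of two 8-bit values needs 16 bits, and summing N such products needs roughly log2(N) additional accumulator bits to avoid overflow; the sketch sidesteps this by relying on Python's arbitrary-precision integers.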