Abstract

Convolutional Neural Networks (CNNs) are widely used in Artificial Intelligence applications because they learn relevant features automatically, without human intervention. For lightweight CNNs, FPGAs can outperform power-hungry GPUs and inflexible ASICs, making them a promising platform for balancing peak performance, energy efficiency, and flexibility. Over the last decade, several frameworks have been proposed to optimize the overall performance of CNNs on hardware platforms. This paper surveys the hardware architectures generated by various software frameworks designed for mapping CNNs onto FPGAs. Classic cases of the streaming architecture and the single computation engine are analyzed in sequence, from traditional CNN-specific processors to the end-to-end mapping with High-Level Synthesis (HLS) tools that has emerged in recent years. Moreover, the adaptability of existing frameworks to upcoming challenges is assessed and future directions for FPGA-based CNN accelerators are identified, providing an in-depth evaluation of hardware architectures for FPGA-based CNN accelerators.
