Abstract

A Convolutional Neural Network (CNN) is a class of deep feed-forward artificial neural network most often used to analyze visual imagery. Rapid progress in CNN-based applications has spurred research on efficient architectures and implementations that exploit the latest technological advances. The growing complexity of CNN architectures, combined with the use of reconfigurable devices, enlarges the design space to the point where it is hard to explore fully. This paper proposes a new method to design and implement efficient and flexible CNN architectures in hardware. The method adopts an N-fold approach that is particularly suitable for reconfigurable devices with strict power-consumption constraints. An 8-layer CNN for classifying handwritten input was trained in software and prototyped, using the proposed CNN architecture, on the FPGA of a Zynq 7000 Programmable System on a Chip (SoC) board. The hardware architecture was described and implemented using High-Level Synthesis, enabling fast development and easy configuration. Experimental results show that the best performance is achieved by a pipelined design with a partial unfolding sublayer. A processing time of 3.4 ms per classification was achieved using 41.16% of the resources and 2.1 W of power. In contrast, a low-power design (<2 W), consuming 27.99% of the board's resources, required 16.7 ms to process the same CNN.
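The two reported design points can be compared on throughput and energy per classification with a few lines of arithmetic. The sketch below is illustrative only: the `DesignPoint` class and its field names are hypothetical, power is assumed to be constant over a classification, and the low-power design's 2.0 W is the stated upper bound rather than a measured value, so its energy figure is also only an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    """One reported configuration (hypothetical container, not from the paper)."""
    name: str
    latency_ms: float     # processing time per classification
    power_w: float        # power draw, assumed constant during processing
    resources_pct: float  # share of board resources used

    @property
    def throughput_per_s(self) -> float:
        # Classifications per second if inputs are processed back to back.
        return 1000.0 / self.latency_ms

    @property
    def energy_mj(self) -> float:
        # watts * milliseconds = millijoules
        return self.power_w * self.latency_ms


# Figures taken from the abstract.
pipelined = DesignPoint("pipelined, partial unfolding", 3.4, 2.1, 41.16)
# Assumption: 2.0 W is the "<2 W" bound, so energy_mj is an upper bound here.
low_power = DesignPoint("low-power", 16.7, 2.0, 27.99)


def speedup(fast: DesignPoint, slow: DesignPoint) -> float:
    """Latency ratio of the slower design to the faster one."""
    return slow.latency_ms / fast.latency_ms


if __name__ == "__main__":
    for d in (pipelined, low_power):
        print(f"{d.name}: {d.throughput_per_s:.1f} classifications/s, "
              f"{d.energy_mj:.2f} mJ per classification, "
              f"{d.resources_pct}% of resources")
    print(f"speedup: {speedup(pipelined, low_power):.2f}x")
```

Under these assumptions the pipelined design is roughly 4.9 times faster and spends about 7.1 mJ per classification, while the low-power design spends at most about 33.4 mJ. Lower power draw therefore does not by itself imply lower energy per result.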
