Abstract

This paper presents a hardware acceleration design for convolutional neural networks. The design targets three key areas of optimization: floating-point to fixed-point conversion, pipelined inter-layer parallel acceleration, and design space exploration. The optimized modules can be composed to build a variety of convolutional networks according to the specifications of the application scenario, yielding a general-purpose design. Experimental results show that the hardware resource optimizations improve the speed and performance of the algorithm: the system achieves an accuracy of 95.09% and an inference speed of 0.237 ms per image. The design solutions presented in this work therefore allow convolutional neural networks to be deployed in a wider range of application scenarios, handling larger data volumes and stricter real-time requirements.
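To make the first optimization area concrete, the following is a minimal sketch in C of floating-point to fixed-point conversion, assuming a hypothetical Q8.8 format (8 integer bits, 8 fractional bits). The paper's actual bit widths and quantization scheme are chosen during design space exploration and are not specified in the abstract; this is an illustration, not the authors' implementation.

```c
#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* Hypothetical Q8.8 format: 8 integer bits, 8 fractional bits.
 * These values are illustrative assumptions; a real design would
 * select bit widths during design space exploration. */
#define FRAC_BITS 8
#define SCALE (1 << FRAC_BITS)

/* Convert a float to 16-bit fixed point, saturating on overflow. */
static int16_t float_to_fixed(float x) {
    float scaled = roundf(x * SCALE);
    if (scaled > INT16_MAX) return INT16_MAX;
    if (scaled < INT16_MIN) return INT16_MIN;
    return (int16_t)scaled;
}

/* Convert back to float to inspect the quantization error. */
static float fixed_to_float(int16_t q) {
    return (float)q / SCALE;
}

int main(void) {
    /* Example weight values, chosen arbitrarily for illustration. */
    float weights[] = { 0.731f, -1.204f, 0.0059f };
    for (int i = 0; i < 3; i++) {
        int16_t q = float_to_fixed(weights[i]);
        printf("%f -> %d (dequantized: %f)\n",
               weights[i], q, fixed_to_float(q));
    }
    return 0;
}
```

Replacing floating-point arithmetic with narrow fixed-point arithmetic of this kind is what reduces DSP and memory usage on hardware, at the cost of bounded quantization error.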
