Deep Convolutional Neural Networks (DCNNs) are widely used in real-time applications, including image classification, speech recognition, and object detection. However, running them in real time on portable devices such as mobile phones or embedded systems remains challenging. Most object detection models are optimized for desktop configurations and require fast GPUs. Shallower networks with lower computational complexity have been proposed for real-time detection, but they compromise detection accuracy. The trade-off between performance and complexity is therefore significant for computationally intensive deep networks. This work aims to implement the CNN-based object detection model Tiny-Yolo-v2 on a Field Programmable Gate Array (FPGA), described natively at the Register Transfer Level (RTL). The hardware implementation is written in VHDL and synthesized for the AMD Virtex 7 FPGA VC709 Connectivity Kit using Vivado 2020.1. To the best of the authors' knowledge, this is the first RTL implementation of the Tiny-Yolo-v2 object detection algorithm on an FPGA. The CNN layers consume 7.09 W at a clock frequency of 100 MHz.