Abstract

Convolutional neural networks (CNNs) are widely used in face detection and recognition, image classification, and semantic segmentation, but they are very difficult to deploy on embedded FPGA platforms. The Deep Learning Processor Unit (DPU) released by Xilinx differs from earlier FPGA deployment approaches: it speeds up CNN deployment on FPGA platforms and supports a variety of classical CNN structures. In this paper, a face and landmark detection CNN is deployed on the ZCU102 platform using the DPU, following a hardware and software co-design approach. To match the network features supported by the DPU, the Normalize operations in VGG-SSD were replaced with BatchNormalize operations, convolution was added to LeNet and a double-layer convolution structure was adopted, and the model was pruned to reduce resource consumption and computation. A dual-core DPU and a deep flow architecture were used to improve data throughput. Experimental results show that face and landmark detection takes 26 ms per video frame on average, and that the design delivers significant acceleration and good scalability.
