Abstract

Edge devices offer advantages such as low computation latency and high data security for executing convolutional neural networks (CNNs). However, deploying CNNs on resource-constrained devices is challenging because of the high computational intensity of CNNs and the limited on-chip hardware resources, which hinders the application of deep learning techniques on edge devices. To address this issue, this paper proposes a reconfigurable CNN edge computing system based on a Field-Programmable Gate Array (FPGA) for target detection tasks. The system exploits the pipeline structure of FPGAs to speed up network computation and uses off-chip memory to store network models, so tiling techniques with high intermediate-cache requirements are not needed. Additionally, we developed a parallel data scheduling model that reduces the impact of storage-access latency on network computation efficiency; as a result, the system achieves efficiency comparable to designs based on on-chip storage while using off-chip storage. The online reconfigurable design allows the network structure and parameters to be configured at runtime for different target recognition tasks, providing greater flexibility for use cases with frequently changing requirements. The proposed system was implemented on a Spartan-6 XC6SLX150 FPGA platform and applied to pedestrian and vehicle classification tasks. Performance was evaluated in terms of detection speed, power consumption, and average intersection over union (IoU). The system achieved a detection speed of 16 frames per second (FPS) on the Spartan-6, with a power consumption of 0.79 W and an average IoU of 41.2%. Compared to the FPGA-based FitNN implementation, the system demonstrated a 178% speed increase and a 60% reduction in power consumption, while classification accuracy decreased by only 2.52%.
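
To make the parallel data scheduling idea concrete, the sketch below shows one common way to hide off-chip memory latency: a ping-pong (double-buffering) scheme in which the data for the next layer is fetched from external memory while the current layer is being computed. The abstract does not specify the actual scheduling mechanism, so this is only an illustrative sketch under that assumption; all names (fetch_weights, compute_layer, NUM_LAYERS, BUF_WORDS) are hypothetical and do not come from the paper.

```c
/*
 * Illustrative sketch only: assumes a ping-pong (double-buffering)
 * scheme for overlapping off-chip memory access with computation.
 * In the real system the fetch would be a DMA transfer from DDR and
 * the compute would run on the FPGA's pipelined convolution engine;
 * here both are modeled as plain C functions so the control flow
 * can be read and executed on its own.
 */
#include <stdio.h>

#define NUM_LAYERS 4      /* hypothetical network depth           */
#define BUF_WORDS  1024   /* hypothetical per-layer weight budget */

/* Stand-in for a DMA transfer from off-chip memory into an on-chip buffer. */
static void fetch_weights(int layer, int *buf) {
    for (int i = 0; i < BUF_WORDS; ++i)
        buf[i] = layer;                   /* placeholder data */
}

/* Stand-in for the pipelined compute engine consuming one buffer. */
static void compute_layer(int layer, const int *buf) {
    long acc = 0;
    for (int i = 0; i < BUF_WORDS; ++i)
        acc += buf[i];
    printf("layer %d computed (checksum %ld)\n", layer, acc);
}

int main(void) {
    int ping[BUF_WORDS], pong[BUF_WORDS];
    int *compute_buf = ping, *fetch_buf = pong;

    fetch_weights(0, compute_buf);        /* prime the first buffer */

    for (int layer = 0; layer < NUM_LAYERS; ++layer) {
        /* In hardware these two steps run concurrently: the DMA fills
         * fetch_buf while the compute engine reads compute_buf. */
        if (layer + 1 < NUM_LAYERS)
            fetch_weights(layer + 1, fetch_buf);
        compute_layer(layer, compute_buf);

        /* Swap roles so the just-fetched data is computed next. */
        int *tmp = compute_buf;
        compute_buf = fetch_buf;
        fetch_buf = tmp;
    }
    return 0;
}
```

When the fetch and compute phases take comparable time, this overlap hides most of the off-chip access latency, which is consistent with the abstract's claim of efficiency comparable to on-chip storage-based designs while using off-chip storage.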
