This paper analyzes the programmable logic resources available on small-scale FPGA chips to provide a sound hardware-resource basis for subsequent neural network accelerator designs. A flexible 32-bit instruction set is designed so that the Processing System (PS) can control the Programmable Logic (PL) side, making motion state detection flexible and controllable. On the hardware side, a resource-sharing strategy is adopted, and most computation modules are built from on-chip DSP resources to reduce their resource consumption. A strategy of leaving part of the data passed between neural network layers uncached is applied to reduce the demand for on-chip cache. To optimize on-chip storage, the limited BRAM space is partitioned and on-chip data reads and writes are processed in parallel, which improves read/write efficiency and thereby the real-time performance of the neural network.