Abstract

In recent years, microelectronics has entered the era of nanoelectronics and integrated microsystems. System in package (SiP) and system on chip (SoC) are two key technical approaches to realizing such microsystems. Meanwhile, deep learning based on convolutional neural networks (CNNs) is widely used in computer vision tasks such as image processing and target recognition, and deploying CNNs on miniaturized embedded platforms has become an important research field. How to combine a lightweight neural network with a microsystem to achieve an optimal balance of performance, size, and power consumption remains a difficult problem. This article introduces a microsystem implementation scheme that combines SiP technology with an FPGA-based convolutional neural network. A Zynq SoC, FLASH, and DDR3 memory serve as the main components and are integrated using high-density SiP packaging. On the PL side (the FPGA fabric), the CNN accelerator is designed by partitioning the convolution across multiple dimensions and applying loop tiling, with multiple parallel multiply-accumulate units providing the system's computing power. The accelerator is refined and evaluated on the YOLOv2-Tiny model, using the COCO data set for training and test samples. The microsystem identifies targets accurately in a volume of only 30 × 30 × 1.2 mm, reaching 22.09 GOPS at a working frequency of 150 MHz while consuming only 0.81 W. A multi-objective balance of performance, size, and power consumption for a lightweight neural network microsystem is thus realized.
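The accelerator structure described in the abstract rests on partitioning the convolution loops and tiling them so that feature-map blocks fit in on-chip buffers and feed parallel multiply-accumulate (MAC) units. The following is a minimal software sketch of that scheduling idea; the tile sizes `Tm`/`Tn` and the loop organization are illustrative assumptions, not the paper's actual design parameters.

```python
import numpy as np

def conv_tiled(ifm, weights, Tm=4, Tn=4):
    """Direct convolution with the output/input-channel loops tiled.

    ifm:     input feature map,  shape (N, H, W)
    weights: kernels,            shape (M, N, K, K)
    returns: output feature map, shape (M, H-K+1, W-K+1)
    """
    M, N, K, _ = weights.shape
    _, H, W = ifm.shape
    Ho, Wo = H - K + 1, W - K + 1
    ofm = np.zeros((M, Ho, Wo))
    for m0 in range(0, M, Tm):          # tile over output channels
        for n0 in range(0, N, Tn):      # tile over input channels
            # Within a tile, the Tm x Tn channel pairs are independent,
            # so on an FPGA each pair can map to one parallel MAC unit.
            for m in range(m0, min(m0 + Tm, M)):
                for n in range(n0, min(n0 + Tn, N)):
                    for i in range(Ho):
                        for j in range(Wo):
                            ofm[m, i, j] += np.sum(
                                ifm[n, i:i + K, j:j + K] * weights[m, n])
    return ofm
```

The tiling does not change the arithmetic, only the order in which partial sums are produced; that reordering is what bounds the on-chip buffer size and exposes the channel-level parallelism the accelerator exploits.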

Highlights

  • With the development of deep learning technology, convolutional neural networks (CNNs) have in recent years been widely used in machine vision fields such as target detection and face recognition

  • Lu et al. [11] introduced a Winograd-based convolution calculation method into the hardware deployment of convolutional neural networks to reduce the number of multiplications required for convolution, so that an accelerator with the same number of DSP modules achieves higher throughput

  • The design in [24] maps all layers of the YOLOv2-Tiny network to the FPGA but does not use ping-pong buffering, so memory-access and data-transfer delays cannot overlap with computation delays
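The Winograd idea cited in [11] trades multiplications for additions. A minimal 1-D illustration is the F(2,3) algorithm, which produces two convolution outputs from a 3-tap filter using 4 multiplications instead of the 6 a direct sliding window needs; this is why the same DSP budget yields more throughput. The cited design uses a 2-D variant; this 1-D case is only an illustrative sketch.

```python
def winograd_f23(d, g):
    """F(2,3): 4-tap input d, 3-tap filter g -> 2 outputs, 4 multiplies."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # The filter-side factors (g0+g1+g2)/2 etc. depend only on g and can
    # be precomputed once per kernel, so they cost no runtime multiplies.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * (g0 + g1 + g2) / 2
    m3 = (d2 - d1) * (g0 - g1 + g2) / 2
    m4 = (d1 - d3) * g2
    return [m1 + m2 + m3, m2 - m3 - m4]

def direct_conv3(d, g):
    """Direct 1-D convolution for reference: 6 multiplies for 2 outputs."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
```

Both functions compute the same two outputs; the Winograd form simply rebalances the arithmetic toward cheap additions.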


Summary

Introduction

With the development of deep learning technology, convolutional neural networks (CNNs) have in recent years been widely used in machine vision fields such as target detection and face recognition. Compared with an SoC, the SiP integration approach is more flexible in terms of R&D cycle and cost, making up for the SoC's shortcomings. To meet the requirements of target detection and terminal guidance in the aviation field (e.g., on UAVs), a microsystem chip is needed that provides the minimum system resources, such as a processor, an FPGA, and memory. SiP microsystem integrated-packaging technology is therefore combined with a convolutional neural network to construct a dynamically reconfigurable deep-learning CNN microsystem: an XC7Z020 chip is integrated and packaged with DDR3, Flash memory, and other components, which reduces volume and power consumption and improves signal integrity.

SiP technology status
Neural network model
Neural network accelerator
SiP chip
Convolution module
Pooling module
Roofline model
Functional testing
Model parameters
Test results
Findings
Conclusions