Abstract

This paper proposes an energy-efficient deep convolutional neural network coprocessor (DCNNs-CP) architecture for multi-object detection applications based on deep learning algorithms. The DCNNs-CP supports both convolutional layers and fully connected layers to accelerate a variety of mobile deep learning algorithms. It also supports maximum and mean pooling through a separate pooling module, and a reconfigurable activation function module supporting four nonlinear functions is realized in the coprocessor. The DCNNs-CP chip was implemented in a 55 nm CMOS process and occupies a 4 mm² die area. The coprocessor supports 8-bit and 16-bit fixed-point data precision and achieves a peak energy efficiency of 3.4 Tops/W at a 1.2 V supply voltage and a maximum frequency of 500 MHz, a 2.13× improvement over previously reported hardware accelerators. In addition, the chip achieves 0.85 Tops/W·mm² energy efficiency per area and 34.0 Tops/W·MB energy efficiency per unit of on-chip memory, making it well suited for integration into mobile devices.
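The per-area figure follows directly from the headline numbers; as a minimal sketch, assuming "energy efficiency per area" means peak efficiency divided by die area (and likewise for the per-memory figure, an interpretation of mine, not stated in the abstract), the reported values can be sanity-checked:

```python
# Sanity-check the derived efficiency figures reported for the DCNNs-CP chip.
# Assumption (not stated in the abstract): "per area" divides peak efficiency
# by die area, and "per memory" divides it by on-chip memory capacity.

peak_tops_per_w = 3.4   # peak energy efficiency (Tops/W)
die_area_mm2 = 4.0      # die area (mm^2)

eff_per_area = peak_tops_per_w / die_area_mm2
print(eff_per_area)  # 0.85 Tops/W·mm², matching the reported figure

# Working backwards from the reported 34.0 Tops/W·MB under the same
# assumption gives the implied on-chip memory capacity:
eff_per_mem = 34.0
implied_memory_mb = peak_tops_per_w / eff_per_mem
print(implied_memory_mb)  # ≈ 0.1 MB (an inference, not a figure from the abstract)
```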
