Abstract

Artificial Neural Networks (ANNs) and image processing require massively parallel computation of simple operators accompanied by heavy memory access. Operators of this type therefore map naturally onto Single Instruction Multiple Data (SIMD) stream parallel processing with distributed memory. This paper proposes a high-performance, programmable neural network processor. The processor is based on a SIMD architecture optimized for neural network and image processing. It supports 24 instructions and consists of 16 Processing Units (PUs) per chip. Each PU includes a 24-bit, 2K-word Local Memory (LM) and a Processing Element (PE). The architecture allows multichip expansion that minimizes the chip-to-chip communication bottleneck. The processor is verified with an FPGA implementation, and its functionality is verified with a character recognition application.
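
To illustrate how a neural network layer could map onto such a SIMD array, the following is a minimal Python sketch and not the processor's actual instruction set or dataflow. The figures of 16 PUs, 2K words, and 24 bits come from the abstract; the one-neuron-per-PU mapping, the broadcast of one input per step, the two's-complement wrap-around on overflow, and the names `wrap24` and `simd_layer` are assumptions made only for illustration.

```python
NUM_PUS = 16      # PUs per chip (from the abstract)
LM_WORDS = 2048   # 2K words of local memory per PU (from the abstract)
WORD_BITS = 24    # word width in bits (from the abstract)


def wrap24(value):
    """Wrap an integer to a signed 24-bit word (assumed overflow behavior)."""
    mask = (1 << WORD_BITS) - 1
    value &= mask
    if value & (1 << (WORD_BITS - 1)):
        value -= 1 << WORD_BITS
    return value


def simd_layer(weights, inputs):
    """Hypothetical mapping: one neuron per PU, its weight row held in that PU's LM."""
    assert len(weights) <= NUM_PUS
    assert all(len(row) <= LM_WORDS for row in weights)
    assert all(len(row) == len(inputs) for row in weights)

    local_memory = [list(row) for row in weights]  # one local memory per PU
    acc = [0] * len(weights)                       # one accumulator per PE

    for step, x in enumerate(inputs):              # one input broadcast per step
        # Every PE executes the same multiply-accumulate on its own local data.
        acc = [wrap24(a + lm[step] * x) for a, lm in zip(acc, local_memory)]
    return acc


if __name__ == "__main__":
    print(simd_layer([[1, 2, 3], [4, 5, 6]], [1, 1, 1]))
```

In this sketch each weight row is read only from its own PU's local memory, so the only shared traffic is the broadcast input, which is the property that makes distributed memory attractive for this kind of workload.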
