A 1.2GFLOPS neural network chip exhibiting fast convergence

Y Kondo,H Mori,Y Koshiba,H Shinohara,M Murasaki,H Amishiro,T Yamada,Y Arima

doi:10.1109/isscc.1994.344663

Abstract

This digital neural network chip for use as core in neural network accelerators employs a single-instruction multi-data-stream (SIMD) architecture and includes twelve 24 b floating-point processing units (PUs), a nonlinear function unit (NFU), and a control unit (CU). Each PU includes 24 b/spl times/1.28 kw local memory and communicates with its neighbor through a shift register ring. This configuration permits both feed-forward and error back propagation (BP) processes to be executed efficiently. The CU, which includes a three stage pipelined sequencer, a 24 b/spl times/1 kw instruction code memory (ICM) and a 144 b/spl times/256 w microcode memory (MCM), broadcasts network parameters (e.g. learning coefficients or temperature parameters) or addresses for local memories through a data and an address bus. Two external memory ports and a ring expansion-port permit large networks to be constructed. The external memory can be expanded by up to 768 kW using the two ports. >

Full Text