Abstract

In a programmable (multistage) cellular neural network (CNN) structure, the CPU is a CNN universal chip that supports massively parallel computation on patterns and images, including video. In this paper, we decompose the structure of a class of simultaneous recurrent networks (SRN) into a CNN program and run it on a von Neumann-like stored-program CNN structure. To train the SRN, we map the back-propagation-through-time (BTT) learning algorithm into a sequence of CNN subroutines so that it can run in real time on a CNN universal chip. Because the chip computes in parallel, it can be programmed to execute the BTT learning algorithm in real time despite the algorithm's very high time complexity. We present an estimate of the time complexity of the BTT learning algorithm on the CNN universal chip. For small-scale problems, our simulation results show that a CNN implementation of the BTT learning algorithm for a two-dimensional SRN is at least 10,000 times faster than an implementation on state-of-the-art sequential workstations. For the few large-scale problems we have simulated so far, the CNN-implemented BTT learning algorithm maintained virtually the same time complexity, with a learning time of a few seconds, whereas the running time of implementations on state-of-the-art sequential workstations increased dramatically, often to several days. Several examples are presented to demonstrate how efficiently a CNN universal chip can speed up the learning algorithm for both off-line and on-line applications.
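
To make the notion of a CNN subroutine concrete, the following minimal sketch emulates one template operation using the standard Chua-Yang cell dynamics, dx/dt = -x + A*y + B*u + z, with the piecewise-linear output y = 0.5(|x + 1| - |x - 1|). It is an illustration under stated assumptions, not the paper's implementation: the function names, the 3x3 templates, the zero boundary condition and the forward-Euler step size are all chosen here for the example. On a CNN universal chip every cell would be updated simultaneously by analog hardware; the nested loops below only emulate that parallel update sequentially.

```python
def saturate(x):
    """Standard CNN output nonlinearity: piecewise-linear clamp to [-1, 1]."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def conv3x3(grid, template):
    """Apply a 3x3 template at every cell; neighbours outside the grid count as 0."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        acc += template[di + 1][dj + 1] * grid[ni][nj]
            out[i][j] = acc
    return out


def cnn_step(x, u, A, B, z, dt=0.1):
    """One forward-Euler step of dx/dt = -x + A*y + B*u + z for all cells."""
    y = [[saturate(v) for v in row] for row in x]
    feedback = conv3x3(y, A)      # contribution of neighbouring outputs
    feedforward = conv3x3(u, B)   # contribution of neighbouring inputs
    return [
        [x[i][j] + dt * (-x[i][j] + feedback[i][j] + feedforward[i][j] + z)
         for j in range(len(x[0]))]
        for i in range(len(x))
    ]


# Hypothetical templates: self-feedback of 2 and a direct input weight of 1.
A = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
u = [[1.0, -1.0], [-1.0, 1.0]]   # input image
x = [[0.0, 0.0], [0.0, 0.0]]     # initial state

for _ in range(200):             # iterate until the states settle
    x = cnn_step(x, u, A, B, z=0.0)

y = [[saturate(v) for v in row] for row in x]  # settled output image
```

With these illustrative templates each cell is decoupled from its neighbours and its output settles to the sign of its input. A CNN program in the sense used above would chain several such template operations, each with its own A, B and z, as subroutines.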

