Address-event representation (AER) is an emergent hardware technology which shows a high potential for providing in the near future a solid technological substrate for emulating brain-like processing structures. When used for vision, AER sensors and processors are not restricted to capturing and processing still image frames, as in commercial frame-based video technology, but sense and process visual information in a pixel-level event-based frameless manner. As a result, vision processing is practically simultaneous to vision sensing, since there is no need to wait for sensing full frames. Also, only meaningful information is sensed, communicated, and processed. Of special interest for brain-like vision processing are some already reported AER convolutional chips, which have revealed a very high computational throughput as well as the possibility of assembling large convolutional neural networks in a modular fashion. It is expected that in a near future we may witness the appearance of large scale convolutional neural networks with hundreds or thousands of individual modules. In the meantime, some research is needed to investigate how to assemble and configure such large scale convolutional networks for specific applications. In this paper, we analyze AER spiking convolutional neural networks for texture recognition hardware applications. Based on the performance figures of already available individual AER convolution chips, we emulate large scale networks using a custom made event-based behavioral simulator. We have developed a new event-based processing architecture that emulates with AER hardware Manjunath's frame-based feature recognition software algorithm, and have analyzed its performance using our behavioral simulator. Recognition rate performance is not degraded. However, regarding speed, we show that recognition can be achieved before an equivalent frame is fully sensed and transmitted.
Read full abstract