Abstract

Integrated Processors (IP) are algorithm-specific cores that, either by programming or by configuration, can be re-used within many microelectronic systems. This paper looks at Cellular Neural Networks (CNN) as candidates for realization as IP. First, current digital implementations are reviewed, and the memory-processor bandwidth issues are analyzed. Then a generic view is taken on the structure of the network, and a new intra-communication protocol based on rotating wheels is proposed. It is shown that this provides guaranteed high performance with a minimal network interface. The resulting node is small and supports multi-level CNN designs, giving the system a 30-fold increase in capacity compared to classical designs. The architecture facilitates multiple operations on a single image, and single operations on multiple images, with minimal access to the external image memory; balancing the internal and external data transfer requirements in this way optimizes the system operation. In conventional digital CNN designs, the treatment of boundary nodes requires additional logic to handle the CNN value propagation scheme. In the new architecture, only a slight modification of the existing cells is necessary to model the boundary effect. A typical prototype for visual pattern recognition will house 4096 CNN cells with a 2% overhead for making it an IP.

Highlights

  • Over the past years, computer architecture has developed from general-purpose processing to the provision of algorithm-specific support

  • The focus of that study is on the size and speed of the network interface (NI) that wraps any design part to become accessible through the network standard

  • A single register is used to hold the current packet before it is multiplied by the corresponding template coefficient, which resides in a local memory (BRAM)

Summary

Introduction

Computer architecture has developed from general-purpose processing to the provision of algorithm-specific support. Three different types of nonlinear functions are frequently used [4]: threshold, hyperbolic tangent, and piecewise linear. Both analogue (mixed-signal) and digital realizations of a CNN have been published [7, 8]. The former have a larger network capacity and allow for handling images of sufficient size. Even for the minimal 1-neighbourhood, 8 pairs of input and output values need to be communicated, one for each neighbouring node. This is affordable in analogue architectures, as each value is carried by a single wire only. We conclude with the effect of such measures on the definition of a CNN as IP, and see that we can prototype up to 4k cells with 2% system overhead on a Xilinx Virtex-II 6000.
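As an illustration of the three nonlinear output functions named above, the following minimal Python sketch may help. The text gives no formulas, so the definitions here are assumptions: a bipolar (±1) threshold, the standard hyperbolic tangent, and the saturating piecewise linear form commonly used in the CNN literature. The function names, and the helper `neighbour_count` that reproduces the figure of 8 neighbours for a 1-neighbourhood, are hypothetical and chosen only for this example.

```python
import math


def threshold(x: float) -> float:
    """Bipolar hard threshold (assumed convention: +1 for x >= 0, else -1)."""
    return 1.0 if x >= 0.0 else -1.0


def tanh_output(x: float) -> float:
    """Smooth sigmoid-like output based on the hyperbolic tangent."""
    return math.tanh(x)


def piecewise_linear(x: float) -> float:
    """Saturating linear output: identity on [-1, 1], clipped to +/-1 outside."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def neighbour_count(r: int) -> int:
    """Number of neighbours of a cell in a square r-neighbourhood on a 2-D grid."""
    return (2 * r + 1) ** 2 - 1


if __name__ == "__main__":
    # Compare the three output functions on a few sample state values.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, threshold(x), round(tanh_output(x), 3), piecewise_linear(x))
    # A 1-neighbourhood has 8 neighbours, one value pair per neighbouring node.
    print(neighbour_count(1))
```

The piecewise linear form needs only additions and absolute values, whereas the hyperbolic tangent would typically call for a lookup table or an approximation in a digital realization.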

CNN Architecture Spectrum
Effect of Slicing
Nodal Models
Wheeled Networks
Boundary Nodes
System Architecture
Findings
Discussion