Abstract

Parallel image processing and numerical analysis make intensive use of matrix manipulation operations. Over the past decades, many parallel storage schemes, called skewing schemes, have been proposed to provide simultaneous access to various data patterns (slices of a matrix). The existing storage schemes have the following limitations: (1) The address generation mechanism depends on the size of the matrix to be processed, so the system hardware must be changed to process matrices of different sizes efficiently. (2) Many schemes restrict the machine size and image size (N × N), for example by requiring that N be an even power of 2. (3) Although more and more frequently used data patterns have been recognized, most schemes provide parallel access to only a limited range of them. (4) With existing routing techniques, data alignment (connecting each memory module to the proper processor) may require special hardware. This paper proposes several storage schemes (EE, MG, EE-MG, and EE*MG). They employ only exclusive-or operations for address generation, which can be completed in constant time. The address generation mechanism is independent of the matrix size, so matrices of different sizes can be processed efficiently on a fixed-size machine. The system uses N memory modules, where N is any (even or odd) power of 2. These schemes cover more data patterns than any other scheme yet proposed. Patterns of N elements that can be accessed in one memory cycle include diagonals, blocks with various shapes, points scattered over various blocks, and chessboards with various shapes. Data alignment requirements can be easily realized on a general-purpose interconnection network, such as a hypercube or a multistage interconnection network (MIN).
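
To make the idea of exclusive-or address generation concrete, the following is a minimal sketch of a plain XOR skew, a textbook baseline and not the EE, MG, EE-MG, or EE*MG schemes, whose definitions the abstract does not give. The names `xor_module` and `is_conflict_free` are hypothetical, and N = 8 is an arbitrary choice. The sketch checks whether N elements of a pattern fall in N distinct memory modules, which is the condition for accessing them in one memory cycle.

```python
def xor_module(i: int, j: int, n: int) -> int:
    """Module index for element (i, j) under a plain XOR skew (illustrative assumption).

    n is the number of memory modules and must be a power of 2.
    The matrix size does not appear in the formula.
    """
    return (i ^ j) & (n - 1)


def is_conflict_free(cells, n: int) -> bool:
    """True if the n given cells map to n distinct modules."""
    return len({xor_module(i, j, n) for i, j in cells}) == n


n = 8
row = [(3, j) for j in range(n)]             # a row segment
col = [(i, 5) for i in range(n)]             # a column segment
shifted_row = [(100, j) for j in range(37, 37 + n)]  # row segment in a larger matrix
diag = [(k, k) for k in range(n)]            # main diagonal

print(is_conflict_free(row, n))
print(is_conflict_free(col, n))
print(is_conflict_free(shifted_row, n))
print(is_conflict_free(diag, n))
```

In this baseline, rows and columns are conflict-free, but every main-diagonal element maps to module 0 because `k ^ k` is 0. This is the kind of limited pattern coverage that motivates richer schemes such as those proposed in the paper.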
