Abstract

In this paper, we give a generalized solution to the problem of conflict-free access to various data templates of a matrix stored across the memory units of a parallel processor. The important features of our method are: (a) compact representation of a skewing scheme, (b) simple address computation, (c) use of self-routing schemes to set up the interconnection network, and (d) a general framework for the study of skewing schemes. In our method, each template access of interest is a linear permutation on the processor address, and the linear permutation involved determines which types of templates are accessible. For parallel access to the most important templates, namely rows, columns, the main diagonal, and square blocks, the interconnection network needs to realize only the class of linear-complement permutations. It is known that a Beneš or Omega interconnection network can efficiently self-route these permutations; this compares favorably with schemes proposed by other researchers, which assume that a crossbar is available for processor-memory interconnection. Hence, the approach given in the paper can be used to solve the data alignment problem for existing parallel machines such as the IBM RP3, the Cedar multiprocessor, and the NYU Ultracomputer. The solution encompasses previous efforts by other researchers on the data skewing problem as special cases.
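As a concrete illustration of what a conflict-free skewing scheme means, the following minimal sketch checks one small case: a 4 x 4 matrix stored across 4 memory modules, with element (i, j) placed in module `j XOR T(i)`, where `T` is a linear map over GF(2) on the two bits of the row index. The particular matrix `T = [[1, 1], [1, 0]]` and all function names are assumptions chosen for this example; they are not taken from the paper, whose general construction may differ.

```python
N = 4     # the matrix is N x N and is stored across N memory modules
HALF = 2  # side length of a square block (sqrt(N))


def t(i: int) -> int:
    """Apply the assumed GF(2) matrix T = [[1, 1], [1, 0]] to the 2-bit row index."""
    i1, i0 = (i >> 1) & 1, i & 1
    return ((i1 ^ i0) << 1) | i1


def module(i: int, j: int) -> int:
    """Memory module holding element (i, j) under the assumed skewing scheme."""
    return j ^ t(i)


def conflict_free(cells) -> bool:
    """A template is conflict-free if all of its cells map to distinct modules."""
    mods = [module(i, j) for i, j in cells]
    return len(set(mods)) == len(mods)


# Enumerate every instance of each template named in the abstract.
rows = [[(i, j) for j in range(N)] for i in range(N)]
cols = [[(i, j) for i in range(N)] for j in range(N)]
main_diag = [[(i, i) for i in range(N)]]
blocks = [
    [(bi + di, bj + dj) for di in range(HALF) for dj in range(HALF)]
    for bi in range(0, N, HALF)
    for bj in range(0, N, HALF)
]

templates = {
    "rows": rows,
    "columns": cols,
    "main diagonal": main_diag,
    "square blocks": blocks,
}
results = {
    name: all(conflict_free(cells) for cells in group)
    for name, group in templates.items()
}

# For comparison: plain XOR skewing (module = i XOR j) sends every
# main-diagonal element to module 0, so it fails the diagonal template.
xor_diag_modules = {i ^ i for i in range(N)}

if __name__ == "__main__":
    for name, ok in results.items():
        print(f"{name}: {'conflict-free' if ok else 'conflict'}")
```

In this sketch, fixing a row and varying the column XORs a constant into the column index, which is why each row access is a permutation of the module addresses. The other templates depend on properties of `T`: it must be invertible for columns, `I + T` must be invertible for the main diagonal, and its top-right entry must be nonzero for the 2 x 2 blocks.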
