Don't Be Too Clever: Routing BMMC Permutations on the MasPar MP-2

T H Cormen

doi:10.1007/s002240000088

Abstract

The authors implemented and measured several methods to perform BMMC permutations on the MasPar MP-2. Each method is coded in C and MPL and uses only the global router to transfer data among processors. Implementation in MPL limits the amount of control the programmer has over the global router. The methods used were a naive method; the PE method, which adapts the block BMMC algorithm for parallel disk systems to treat processor elements as independent devices; and the cluster method, which adapts the block BMMC algorithm to treat clusters of processor elements as independent devices. The authors also implemented variations of the first two methods to compute virtual-processor numbers in Gray-code order. Our results indicate that for random BMMC permutations, the best method overall is the naive method with the Gray-code variation. The PE methods are almost as good, especially when each processor element has several items to route. For some permutations, however, the naive method performs very poorly; the best method in these cases is the PE method with the Gray-code variant. Although one might expect the cluster method to be best, since it minimizes congestion at the global router ports, this method turns out to be the worst overall, and it is always much worse than the PE method. If it had been coded at a lower level than MPL, the cluster method may well have turned out to be the best in many cases.

Full Text