Abstract

This paper presents methods for generating communication when compiling High Performance Fortran (HPF) programs for distributed-memory machines. We introduce the concept of an iteration template, which corresponds to an iteration space. Our HPF compiler maps loop iterations through a two-level mapping of the iteration template, in the same way that data mapping is performed in HPF. Using this unified mapping model for data and loops, communication for nonlocal accesses is handled by data realignment between the user-declared alignment and the optimal alignment, which ensures that only local accesses occur inside the loop. This strategy provides an effective means of dealing with communication for arrays with undefined mapping, a simple way of generating communication, and high portability of the HPF compiler. Experimental results on the NEC Cenju-3 distributed-memory machine demonstrate the effectiveness of our approach: the execution time of the compiler-generated program was within 10% of that of the hand-parallelized program.
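
As an illustration of the realignment idea only, the following minimal Python sketch models a two-level mapping (alignment to a template, then distribution of the template over processors) and lists the array elements that would have to move between the declared and the optimal alignment. It is not the compiler described in this paper: the function names, the BLOCK distribution, the offset-only alignment, and the example loop `A(i) = B(i + shift)` are all assumptions made for the example.

```python
def block_owner(cell, template_size, num_procs):
    """Processor that owns a template cell under an assumed BLOCK distribution."""
    block = -(-template_size // num_procs)  # ceiling division: cells per processor
    return cell // block


def owner(index, offset, template_size, num_procs):
    """Two-level mapping: align index to template cell index + offset, then distribute."""
    return block_owner(index + offset, template_size, num_procs)


def realignment_moves(n, shift, template_size, num_procs):
    """Elements of B to move for the hypothetical loop A(i) = B(i + shift), 0 <= i < n.

    Assumptions: iteration i is aligned with template cell i (where A(i) lives),
    and B is declared with offset 0. The optimal alignment places B(i + shift)
    on the processor that executes iteration i, so the loop body is all local.
    Returns a list of (element of B, declared owner, optimal owner).
    """
    moves = []
    for i in range(n):
        j = i + shift
        declared = owner(j, 0, template_size, num_procs)       # user-declared alignment
        optimal = owner(j, -shift, template_size, num_procs)   # cell of iteration i
        if declared != optimal:
            moves.append((j, declared, optimal))
    return moves


if __name__ == "__main__":
    # 8 iterations, B read one element ahead, 12 template cells on 4 processors.
    print(realignment_moves(8, 1, 12, 4))
```

In this toy setting, only the elements at block boundaries change owner, so the realignment amounts to a small boundary exchange performed before the loop.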
