Abstract
For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of high-performance Fortran (HPF)-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.
Highlights
In most existing parallel programming systems, each parallel program or job is assigned a fixed number of processors in a dedicated mode.
We have developed our run-time support for adaptive parallelism on top of Multiblock PARTI, because this run-time library provides much of the run-time support required for forall loops and array expressions in data-parallel languages like High Performance Fortran (HPF).
This run-time library can be used to optimize communication and partition work for HPF codes in which data distribution, loop bounds, and/or strides are unknown at compile-time and indirection arrays are not used.
Summary
In most existing parallel programming systems, each parallel program or job is assigned a fixed number of processors in a dedicated mode. To the best of our knowledge, all existing work on compiling data-parallel applications assumes that the number of processors available for execution does not vary at run-time [4,5,6]. Executing a program in an adaptive environment instead requires two tasks: 1. redistributing data when the number of available processors changes; 2. handling work distribution and communication detection, insertion, and optimization when the number of processors on which a given parallel loop will be executed is not known at compile-time. Our run-time library includes routines for handling these two tasks. It can be used by compilers for data-parallel languages, or by a programmer parallelizing an application by hand. Our experimental results show that if the number of available processors does not vary frequently, the cost of redistributing data is not significant compared to the total execution time of the program.
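To make the two tasks concrete, the following is a minimal sketch, not the paper's actual library API: under a 1-D block distribution, each processor can recompute its local loop bounds from the new processor count, and redistribution amounts to moving the elements whose owner changed. The function names `block_bounds` and `redistribute` are illustrative assumptions; real message passing between processors is elided.

```python
def block_bounds(n, p, rank):
    """Global index range (lo, hi), hi exclusive, owned by `rank`
    when n elements are block-distributed over p processors.
    The first n % p processors get one extra element."""
    base, extra = divmod(n, p)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi

def redistribute(array, new_p):
    """Re-split a (here, logically gathered) global array into the
    blocks owned by each of new_p processors. In a real library this
    would exchange only the elements whose owning processor changed."""
    n = len(array)
    return [array[lo:hi]
            for lo, hi in (block_bounds(n, new_p, r) for r in range(new_p))]
```

When the environment grants or revokes processors at run-time, each surviving processor calls `block_bounds` with the new processor count to obtain its new loop bounds, and the redistribution step moves data accordingly before computation resumes.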