Abstract

An analysis is presented of the primary factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on distributed-memory, massively parallel computer systems. Several modifications to the original parallel AGCM code, aimed at improving its numerical efficiency, load balance and single-node performance, are discussed. The impact of these optimization strategies on performance is presented and analyzed for two state-of-the-art parallel computers, the Intel Paragon and the Cray T3D. It is found that implementation of a load-balanced FFT algorithm reduces overall execution time by approximately 45% compared to the original convolution-based algorithm. Preliminary results from applying a load-balancing scheme to the physics part of the AGCM code suggest that additional reductions in execution time of 10–15% can be achieved. Finally, several strategies for improving the single-node performance of the code are presented, and the results obtained thus far suggest that reductions in execution time in the range of 35–45% are possible. © 1998 John Wiley & Sons, Ltd.

