Abstract

Publisher Summary

Small to medium-sized clusters of workstations with up to four processors per node are very popular because of their price-performance ratio. This chapter explains how different implementations of the Message Passing Interface (MPI) standard, running over different interconnection networks, influence the runtime of parallel applications on a cluster of symmetric multiprocessor (SMP) workstations. Two benchmark suites are used: the Special Karlsruher MPI (SKaMPI) benchmark suite for measuring MPI communication operations, and the NAS Parallel Benchmark suite for application-specific measurements. The runtime of MPI applications is influenced by two factors: the performance of the interconnection network used and the implementation of the point-to-point and collective communication operations. Both MPI implementations using Fast Ethernet deliver comparable results, with a small advantage for LAM when many small messages are exchanged. ScaMPI delivers the best results for almost all benchmark programs. MP-MPICH, the freely available SCI MPI implementation, shows good results for all benchmark programs but is currently limited to 16 MPI processes.
