Abstract

Understanding the performance of parallel applications for prevalent clusters of commodity PCs is still not an easy task: One must understand performance characteristics of all subsystems in the cluster machine, besides the inherently required knowledge about the applications' behaviour. While there are already many benchmarks that characterise a single node's subsystems like CPU, memory and I/O. as well as a few to evaluate its network interface with point-to-point data streams, there are to the best of our knowledge no benchmarks available that characterise a cluster network or LAN as a whole. We present Switchbench (2005), a set of three microbenchmarks that thoroughly evaluate the throughput characteristics of networks for clusters. A first microbenchmark tests the basic processing limitations of the switches, by sending and receiving data concurrently at maximum throughputs on all network interfaces. A second microbenchmark tests arbitrary communication patterns by pairwise connecting nodes for high-speed throughput tests. A third and slightly more realistic microbenchmark executes an all-to-all personalised communication (AAPC) algorithm to test many different patterns and critical bisections in the network. The microbenchmarks already proved to be extremely useful in a previous study to experimentally quantify performance limitations in different networks of clusters of PCs with up to 128 nodes. We also establish the suitability of our microbenchmarks by comparing their results with two application benchmarks. The benchmarks consist of two C programs supported by shell scripts to start the programs on all nodes of the cluster with the correct execution parameters to automatically scale the workloads from a few nodes up to the full cluster size.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call