We propose that clusters interconnected with network topologies of minimal mean path length achieve higher overall performance across a variety of applications. We test this heuristic by constructing clusters of up to 36 nodes with Dragonfly, torus, ring, Chvatal, Wagner, Bidiakis and several other topologies with minimal mean path lengths, and by simulating the performance of 256-node clusters with the same network topologies. The optimal (or near-optimal) low-latency network topologies are found by minimizing the mean path length over regular graphs. The selected topologies are benchmarked using ping-pong messaging, MPI collective communications, and standard parallel applications including effective bandwidth, FFTE, Graph 500 and the NAS parallel benchmarks. We find strong correlations between cluster performance and network topology, especially mean path length, across a wide range of applications. In communication-intensive benchmarks, clusters with optimal network topologies outperform those with mainstream topologies severalfold. Strikingly, adjusting the network topology alone suffices to reclaim performance from the same computing hardware.
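To make the central metric concrete, the following minimal sketch computes the mean path length (the average shortest-path hop count over all node pairs) for two of the topologies named above: an 8-node ring and the 8-node Wagner graph. The helper names `mean_path_length`, `ring` and `wagner` are illustrative assumptions and do not come from the paper. The sketch only evaluates the metric for a given graph; it does not reproduce the authors' search over regular graphs.

```python
from collections import deque


def mean_path_length(adj):
    """Average shortest-path hop count over all ordered node pairs.

    `adj` maps each node to the set of its neighbours (undirected graph).
    """
    n = len(adj)
    total = 0
    for src in adj:
        # Breadth-first search gives hop distances from src to every node.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != n:
            raise ValueError("graph is disconnected")
        total += sum(dist.values())
    return total / (n * (n - 1))


def ring(n):
    """Ring of n nodes: each node links to its two cyclic neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def wagner():
    """Wagner graph: an 8-node ring plus a link between opposite nodes."""
    adj = ring(8)
    for i in range(8):
        adj[i].add((i + 4) % 8)
    return adj


if __name__ == "__main__":
    print(f"ring(8):  {mean_path_length(ring(8)):.4f}")
    print(f"wagner(): {mean_path_length(wagner()):.4f}")
```

Note that the two graphs differ in degree (two links per node for the ring, three for the Wagner graph), so this comparison illustrates the metric rather than a like-for-like ranking of equal-cost topologies.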