Abstract

This paper examines cluster architectures and the associated message-passing (MP) software algorithms, and their effect on the performance (speedup and efficiency) of cluster computing (CC). We present new architectures for multi-segment Ethernet clusters and new MP algorithms that fit these architectures. The multiple segments (e.g., commodity hubs) connect commodity processor nodes so that MP can be highly parallelized, avoiding network contention and collisions in many applications where all-gather and other collective operations are central. We analyze all-gather in some detail and present new network topologies and new MP algorithms to minimize latency. The new topologies are based on a design by Compbionics called two-by-four nets (2×4 nets). An integrated MP software system that embodies the MP algorithms, called Reduced Overhead Cluster Communication (ROCC), is also described. In brief, 2×4 nets are networks of “supernodes”, called 2×4's, each having 4 processors on 2 segments, where the segments are usually Ethernet hubs. The supernodes are typically connected to form rings or tori. We present actual test results and supporting analyses to demonstrate that 2×4 nets with the ROCC MP software are faster than many existing clusters and generally less costly.
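
For readers unfamiliar with the all-gather collective mentioned above, the following minimal sketch simulates the generic textbook ring all-gather, in which every node ends up holding every other node's block after p - 1 forwarding steps. It is not the ROCC implementation or the 2×4 algorithm, neither of which is specified in this abstract; the function name `ring_all_gather` and the list-based data layout are assumptions made purely for illustration.

```python
def ring_all_gather(local_blocks):
    """Simulate a generic ring all-gather (illustrative only, not ROCC).

    Node i starts with local_blocks[i]; the return value is a list in
    which entry i is the list of all blocks as held by node i.
    """
    p = len(local_blocks)
    # gathered[i][j] holds block j once node i has it, else None.
    gathered = [[None] * p for _ in range(p)]
    for i in range(p):
        gathered[i][i] = local_blocks[i]

    # In step s, node i forwards the block that originated at node
    # (i - s) mod p to its right-hand neighbour (i + 1) mod p.
    for s in range(p - 1):
        # Collect all sends first so every node sends "simultaneously".
        sends = [((i + 1) % p, (i - s) % p, gathered[i][(i - s) % p])
                 for i in range(p)]
        for dest, idx, block in sends:
            gathered[dest][idx] = block
    return gathered


if __name__ == "__main__":
    # Four nodes, matching the 4 processors of a single supernode.
    for node, blocks in enumerate(ring_all_gather(["a", "b", "c", "d"])):
        print(node, blocks)
```

In this simulation each step moves one block per node to a single neighbour, so the number of steps grows linearly with the node count; the abstract's claim is that suitable multi-segment topologies and MP algorithms reduce the latency of such collectives.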
