Abstract
As the size of high-performance computing systems grows, the number of events requiring a network reconfiguration, as well as the complexity of each reconfiguration, is likely to increase. In large systems, the probability of component failure is high. At the same time, with more network components, ensuring high utilization of network resources becomes challenging. Reconfiguration in interconnection networks, like InfiniBand (IB), typically involves computation and distribution of a new set of routes in order to maintain connectivity and performance. In general, current routing algorithms do not consider the existing routes in a network when calculating new ones. Such configuration-oblivious routing might result in substantial modifications to the existing paths, and the reconfiguration becomes costly as it potentially involves a large number of source-destination pairs. In this paper, we propose a novel routing algorithm for IB based fat-tree topologies, SlimUpdate. SlimUpdate employs techniques to preserve existing forwarding entries in switches to ensure a minimal routing update, without any performance penalty, and with minimal computational overhead. We present an implementation of SlimUpdate in OpenSM, and compare it with the current de facto fat-tree routing algorithm. Our experiments and simulations show a decrease of up to 80% in the number of total path modifications when using SlimUpdate routing, while achieving similar or even better performance than the fat-tree routing in most reconfiguration scenarios.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have