Abstract

Dragonflies are one of the most promising topologies for the Exascale effort because of their scalability and cost. They achieve very high throughput under uniform traffic but behave pathologically under other regular traffic patterns, some of which are very common in HPC applications. A recent study showed that randomizing task placement can make pathological, regular (multi-dimensional stencil) traffic patterns behave similarly to uniform traffic. In this work we provide a theoretical model that predicts the expected performance of a generic dragonfly network under uniform traffic and characterize performance-optimal dragonflies. We then analyze whether this model can be extended to other patterns by benchmarking multiple such patterns under both contiguous and randomized task placement. We conclude that, although randomization significantly improves performance for pathological communication patterns compared with contiguous task placement, the resulting performance is not on par with that of uniform traffic, but rather half of it.
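To illustrate the difference between contiguous and randomized task placement, the following is a minimal sketch and not the model or benchmark used in this work. It maps a 2D wraparound stencil onto nodes partitioned into groups and counts messages per ordered pair of groups. The function names, the 4-point stencil, the grid and group sizes, and the use of per-group-pair message counts as a stand-in for global-link load are all assumptions made for illustration only.

```python
import random
from collections import Counter


def stencil_pairs(nx, ny):
    """Yield (src, dst) task pairs of a 4-point 2D stencil with wraparound."""
    for x in range(nx):
        for y in range(ny):
            src = x * ny + y
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                dst = ((x + dx) % nx) * ny + (y + dy) % ny
                yield src, dst


def placement_stats(nx, ny, nodes_per_group, randomize, seed=0):
    """Return (max messages on any group pair, number of group pairs used).

    Assumption: traffic between two distinct groups is a rough proxy for
    the load on the global link(s) connecting them; intra-group messages
    are ignored.
    """
    placement = list(range(nx * ny))  # contiguous: task i runs on node i
    if randomize:
        random.Random(seed).shuffle(placement)  # random task-to-node map

    load = Counter()
    for src, dst in stencil_pairs(nx, ny):
        g_src = placement[src] // nodes_per_group
        g_dst = placement[dst] // nodes_per_group
        if g_src != g_dst:
            load[(g_src, g_dst)] += 1
    return max(load.values()), len(load)


if __name__ == "__main__":
    # Hypothetical sizes: 16x16 tasks, 16 groups of 16 nodes each.
    for label, rnd in (("contiguous", False), ("randomized", True)):
        peak, used = placement_stats(16, 16, 16, randomize=rnd)
        print(f"{label}: peak group-pair load={peak}, group pairs used={used}")
```

In this toy setup, contiguous placement puts each stencil row in one group, so every group exchanges messages only with its two neighboring groups and the load concentrates on a few group pairs. Randomized placement spreads the same messages over many more group pairs, which is closer to what uniform traffic would produce.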
