Abstract

Reducing the need for users to manually manage the details of work and data distribution is an important goal of high-level many-task runtime systems. For distributed memory platforms this means that the runtime system has to keep track of both fine-grained task dependencies and data residency meta-information. The amount of such meta-information is proportional to the granularity of parallelism which needs to be managed, introducing a trade-off. More precise tracking of data state allows leveraging more opportunities for compute and transfer parallelism, while also introducing more overhead. As such, the fidelity of the information being tracked needs to be managed carefully, ideally without introducing additional latency, communication or substantial compute overhead. We present the “Horizons” approach, designed to fulfill these goals. Specifically, horizons allow for the effective and efficient management of parallelism and the coalescing of previous fine-grained tracking information while maintaining an easily configurable scheduling window with full information precision. As an additional benefit, they provide consistent cluster-wide decision points without requiring any inter-node communication, and effectively cap the size of state tracking data structures even in the presence of problematic access patterns. Experimental evaluation on microbenchmarks and dry runs demonstrates that horizons are effective in keeping the scheduling complexity constant, while their own overhead is negligible, below 10 μs per horizon when building a command graph for 512 GPUs. We additionally demonstrate the performance impact of horizons, as well as their low overhead, on a real-world application.

