Parallel Scheduling of Task Trees with Limited Memory

Lionel Eyraud-Dubois, Frédéric Vivien, Oliver Sinnen, Loris Marchal

doi:10.1145/2779052

Abstract

This article investigates the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents some large data. A task can only be executed if all its input and output data fit into memory, and a data item can only be removed from memory after the completion of the task that uses it as input. Such trees arise in the multifrontal method of sparse matrix factorization. The peak memory needed to process the entire tree depends on the execution order of the tasks. With one processor, the objective of the tree traversal is to minimize the required memory. This problem has been well studied, and optimal polynomial algorithms have been proposed. Here, we extend the problem by considering multiple processors, which is of obvious interest in the application area of matrix factorization. With multiple processors comes the additional objective of minimizing the time needed to traverse the tree, that is, the makespan. Not surprisingly, this problem proves to be much harder than the sequential one. We study the computational complexity of this problem and provide inapproximability results even for unit weight trees. We design a series of practical heuristics achieving different trade-offs between the minimization of peak memory usage and makespan. Some of these heuristics are able to process a tree while keeping the memory usage under a given memory limit. The heuristics are assessed in an extensive experimental evaluation using realistic trees.
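To make the memory rule concrete, the following is a minimal sketch of how the peak memory of a sequential traversal could be computed. It is an illustration under simplifying assumptions, not the article's algorithm: each task produces a single output of a given size for its parent, tasks need no memory beyond their input and output data, and the names `peak_memory`, `parent`, `out_size`, and `order` are hypothetical.

```python
def peak_memory(parent, out_size, order):
    """Peak memory of a sequential traversal of an in-tree.

    parent:   dict mapping each task to its parent (None for the root)
    out_size: dict mapping each task to the size of the data it produces
    order:    list of all tasks, children before parents

    Assumed model: while a task runs, its inputs, its output, and every
    output still waiting for its consumer are in memory; the inputs are
    freed once the task completes.
    """
    children = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)

    done = set()
    resident = 0  # total size of produced data not yet consumed
    peak = 0
    for v in order:
        if any(c not in done for c in children[v]):
            raise ValueError(f"task {v!r} scheduled before one of its children")
        # Inputs are already counted in `resident`; add the output.
        during = resident + out_size[v]
        peak = max(peak, during)
        # After completion the inputs are released, the output stays.
        resident = during - sum(out_size[c] for c in children[v])
        done.add(v)
    return peak


# Hypothetical tree: root r with children a and b, each having one leaf child.
parent = {"r": None, "a": "r", "b": "r", "a1": "a", "b1": "b"}
out_size = {"r": 0, "a": 1, "b": 1, "a1": 5, "b1": 5}

depth_first = ["a1", "a", "b1", "b", "r"]
leaves_first = ["a1", "b1", "a", "b", "r"]

print(peak_memory(parent, out_size, depth_first))   # 7
print(peak_memory(parent, out_size, leaves_first))  # 11
```

On this small example, finishing one subtree before starting the other keeps at most 7 units in memory, whereas processing both leaves first requires 11, which shows how the execution order alone changes the peak.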

Highlights

Parallel workloads are often modeled as task graphs, where nodes represent tasks and edges represent the dependencies between tasks.
We design a series of practical heuristics achieving different trade-offs between the minimization of peak memory usage and makespan; some of these heuristics are guaranteed to keep the memory usage under a given memory limit.
In this study, we investigate the scheduling of tree-shaped task graphs onto multiple processors under a given memory limit, with the objective of minimizing the makespan.

Summary

Introduction

Parallel workloads are often modeled as task graphs, where nodes represent tasks and edges represent the dependencies between tasks. Efforts have been made to design dynamic schedulers that take into account dynamic pivoting (which impacts the weights of edges and nodes) when scheduling elimination trees with strong memory constraints [9], or to consider both task and tree parallelism with memory constraints [1]. While these studies try to optimize memory management in existing parallel solvers, we aim to design a simple model for studying the fundamental underlying scheduling problem.

Objectives

Results

Conclusion