Abstract
Loop scheduling is an important issue in the development of high-performance multiprocessors. Because modern multiprocessors have high and non-uniform memory access (NUMA) costs, communication costs dominate the execution of parallel programs. Previous affinity algorithms perform better than dynamic algorithms on non-clustered NUMA multiprocessors, but they suffer heavy overheads when migrating workload on clustered NUMA machines. In this paper, we propose a new loop scheduling policy, the hierarchical policy, to improve various affinity scheduling algorithms (AFSs) for clustered NUMA machines. We cyclically distribute the iteration chunks to clusters. When imbalance occurs, the migration of iterations is carried out hierarchically. We use the hierarchical policy to improve AFS and modified AFS (MAFS), and we call the resulting algorithms Hierarchical AFS (HAFS) and Hierarchical MAFS (HMAFS), respectively. AFS uses a deterministic assignment policy to assign repeated executions of a loop iteration to the same processor. MAFS modifies the migration policy of AFS and reduces the number of synchronization operations. We confirm our idea by running many applications on a clustered NUMA simulator. Our experimental results show that the hierarchical policy reduces inter-cluster remote memory accesses, decreases the number of lock operations on the queues, and effectively balances the workload. We also show that HMAFS is the best choice among these algorithms in most cases.
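The two mechanisms named in the abstract, cyclic distribution of iteration chunks to clusters and hierarchical migration under imbalance, can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not give the chunk size, how a source processor is chosen, or how many iterations move, so the function names, the "most loaded processor" rule, and the "own cluster first, then other clusters" search order are all assumptions made for illustration.

```python
from collections import deque


def distribute_chunks_cyclically(num_iterations, chunk_size, num_clusters):
    """Assign consecutive iteration chunks to clusters in round-robin order.

    Returns one queue per cluster; each entry is a half-open range (start, end).
    The fixed chunk_size is an assumption; the abstract does not specify it.
    """
    queues = [deque() for _ in range(num_clusters)]
    for chunk_index, start in enumerate(range(0, num_iterations, chunk_size)):
        end = min(start + chunk_size, num_iterations)
        queues[chunk_index % num_clusters].append((start, end))
    return queues


def pick_victim(loads, cluster_of, idle_proc):
    """Pick the processor an idle processor takes work from (hypothetical rule).

    loads[p] is the number of iterations left in processor p's queue and
    cluster_of[p] is the cluster that p belongs to. The search is hierarchical:
    the most loaded processor in the idle processor's own cluster is preferred,
    and other clusters are considered only if the local cluster has no work.
    Returns None when no processor has work left.
    """
    home = cluster_of[idle_proc]
    local = [p for p in range(len(loads))
             if p != idle_proc and cluster_of[p] == home and loads[p] > 0]
    if local:
        return max(local, key=lambda p: loads[p])
    remote = [p for p in range(len(loads))
              if cluster_of[p] != home and loads[p] > 0]
    if remote:
        return max(remote, key=lambda p: loads[p])
    return None


if __name__ == "__main__":
    # 10 iterations in chunks of 2, dealt out to 2 clusters.
    print(distribute_chunks_cyclically(10, 2, 2))
    # Processors 0 and 1 are in cluster 0; processors 2 and 3 are in cluster 1.
    print(pick_victim([0, 3, 9, 5], [0, 0, 1, 1], idle_proc=0))
```

In this sketch, a processor crosses a cluster boundary only after its own cluster is out of work, which is one plausible way to obtain the reduction in inter-cluster remote memory accesses that the abstract reports.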