Abstract

Current multicomputers are typically built as interconnected clusters of shared-memory multicore computers. A common programming approach for these clusters is simply to use a message-passing paradigm, launching as many processes as there are cores. Nevertheless, to better exploit the scalability of these clusters and highly parallel multicore systems, it is necessary to use their distributed- and shared-memory hierarchies efficiently. This implies combining different programming paradigms and tools at different levels of the program design. This paper presents an approach to ease programming for mixed distributed- and shared-memory parallel computers. Coordination at the distributed-memory level is simplified using Hitmap, a library for distributed computing based on hierarchical tiling of data structures. We show how this tool can be integrated with shared-memory programming models and automatic code-generation tools to efficiently exploit the multicore environment of each multicomputer node. This approach allows the most appropriate techniques to be applied at each level, easily generating multilevel parallel programs that automatically adapt their communication and synchronization structures to the target machine. Our experimental results show how this approach matches or even improves on the best performance results obtained with manually optimized codes using pure MPI or OpenMP models.

Highlights

  • The polyhedral model has proven to be a useful tool to transform and generate parallel programs for codes with affine nested loops [1]

  • In this paper we study the codes generated by the most sophisticated communication scheme introduced so far

  • This paper presents a model for the run-time cost of the codes generated by a state-of-the-art polyhedral-model technique (FOP scheme), for communication management in a distributed-memory environment


Summary

INTRODUCTION

The polyhedral model has proven to be a useful tool to transform and generate parallel programs for codes with affine nested loops [1]. The automatically generated codes are capable of coordinating the computation and communication across heterogeneous devices. This allows the exploitation of parallelism in heterogeneous clusters with GPUs or other accelerators, which is the current trend for building huge parallel systems [8]. The scale of the machines and problems that can currently be tackled grows by several orders of magnitude compared with those found in most previous performance evaluations of distributed-memory polyhedral-generated codes, and it will continue to grow, with exascale computing being an important research focus.
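For readers unfamiliar with the term, an "affine nested loop" is one whose loop bounds and array subscripts are affine (linear plus constant) functions of the loop indices. A 1-D Jacobi stencil, shown below as an illustrative example (not taken from the paper's benchmarks), is the classic case: its iteration-space dependences can be analyzed exactly, which is what lets polyhedral tools tile and distribute the nest automatically.

```c
/* Example of an affine loop nest of the kind the polyhedral model
 * targets: a 1-D Jacobi stencil. Loop bounds (0 <= t < T, 1 <= i < N-1)
 * and subscripts (i-1, i, i+1) are all affine functions of the indices. */
#include <string.h>

#define N 16
#define T 4

void jacobi1d(double a[N]) {
    double b[N];
    for (int t = 0; t < T; t++) {              /* outer time loop        */
        for (int i = 1; i < N - 1; i++)        /* affine inner bounds    */
            b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0; /* affine accesses */
        memcpy(&a[1], &b[1], (N - 2) * sizeof(double)); /* commit sweep  */
    }
}
```

Because every dependence here has a constant affine distance, a polyhedral compiler can partition the (t, i) iteration space into tiles, assign tiles to processes, and generate the pack/send/unpack communication code for the tile boundaries, which is precisely the kind of generated code whose run-time cost this paper models.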

THE COMMUNICATION SCHEME
COST MODEL
General cost for a distributed loop
Problem size and number of iterations
Distribution policy
Packing stage
Coordination and communication stage
Unpacking stage
Total cost
CASE STUDY
Cost model parametrization
Simulation study
Experimental environment
Results
Findings
CONCLUSION
