Abstract

This article reports on experiments from our ongoing project whose goal is to develop a C++ library which supports adaptive and irregular data structures on distributed memory supercomputers. We demonstrate the use of our abstractions in implementing "tree codes" for large-scale N-body simulations. These algorithms require dynamically evolving treelike data structures, as well as load-balancing, both of which are widely believed to make the application difficult and cumbersome to program for distributed-memory machines. The ease of writing the application code on top of our C++ library abstractions (which themselves are application independent), and the low overhead of the resulting C++ code (over hand-crafted C code), support our belief that object-oriented approaches are eminently suited to programming distributed-memory machines in a manner that (to the applications programmer) is architecture-independent. Our contribution in parallel programming methodology is to identify and encapsulate general classes of communication and load-balancing strategies useful across applications and MIMD architectures. This article reports experimental results from simulations of half a million particles using multiple methods.

Highlights

  • Parallel programs are written in either of two styles; in the reactive style of programming, the user specifies the local computation and the interactions between individual processors

  • By way of analogy with Fortran90/HPF, we know that whatever distributed data structure we support must have global operators analogous to Fortran90 array intrinsics (e.g., CSHIFT, EOSHIFT) and data distribution directives analogous to BLOCK and CYCLIC strategies in HPF

Summary

INTRODUCTION

Parallel programs are written in either of two styles. In the reactive style of programming, the user specifies the local computation and the interactions between individual processors. By way of analogy with Fortran90/HPF, we know that whatever distributed data structure we support must have global operators analogous to Fortran90 array intrinsics (e.g., CSHIFT, EOSHIFT) and data distribution directives analogous to the BLOCK and CYCLIC strategies in HPF. To meet these challenges, we can no longer retain a single-layered application-system interface such as that implied by HPF. The challenges that seem considerably harder for derived data structures than for arrays are supporting resolution between global and local naming spaces, as well as fine coordination between structure traversal and per-element computation. This style of interface, in turn, requires linguistic mechanisms such as polymorphic classes and functions (called templates in C++, not to be confused with TEMPLATE in HPF). We focus on one type of distributed data structure (DDS), a distributed tree called PTREE, for discussing the run-time system organization.

PTREE: Top Level Overview
Related Work
DISTRIBUTED DATA STRUCTURES
Distribution Module
System Services
Per-Node Functions
Link Traverse
Mail Boxes
Global Operations
Deliver
Insert and Delete
Remapping
Structural Coherence
Problem Description
Distribution Strategy
A Generic Distributed Tree
Per-Node Functions for BH-Tree Class
Structural Modification and ORB Remapping
Some Preliminary Performance Findings
CONCLUDING REMARKS
Findings
A higher degree of coordination between system and application code