Nested Data Parallelism Research Articles

Nested data-parallelism (NDP) is a declarative style for programming irregular parallel applications. NDP languages provide language features favoring the NDP style, efficient compilation of NDP programs, and various common NDP operations like parallel maps, filters, and sum-like reductions. In this paper, we describe the implementation of NDP in Parallel ML (PML), part of the Manticore project. Managing the parallel decomposition of work is one of the main challenges of implementing NDP. If the decomposition creates too many small chunks of work, performance will be eroded by too much parallel overhead. If, on the other hand, there are too few large chunks of work, there will be too much sequential processing and processors will sit idle. Recently the technique of Lazy Binary Splitting was proposed for dynamic parallel decomposition of work on flat arrays, with promising results. We adapt Lazy Binary Splitting to parallel processing of binary trees, which we use to represent parallel arrays in PML. We call our technique Lazy Tree Splitting (LTS). One of its main advantages is its performance robustness: per-program tuning is not required to achieve good performance across varying platforms. We describe LTS-based implementations of standard NDP operations, and we present experimental data demonstrating the scalability of LTS across a range of benchmarks.
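
The lazy splitting idea in the abstract above can be illustrated with a minimal single-threaded Python sketch. It is not the PML or Manticore implementation. The function name `lazy_split_sum`, the `chunk` parameter, and the use of a plain deque to stand in for a worker's task queue are all assumptions made for illustration. The sketch splits the remaining range in half only when the queue is empty (the point at which other processors would have nothing to take) and otherwise runs a small chunk sequentially.

```python
from collections import deque


def lazy_split_sum(xs, f, chunk=4):
    """Sum f(x) over xs, splitting work only when the task queue is empty.

    Hypothetical single-threaded simulation: `queue` stands in for the
    pool of pending ranges that idle workers could pick up.
    """
    queue = deque([(0, len(xs))])
    total = 0
    splits = 0
    while queue:
        lo, hi = queue.pop()
        while lo < hi:
            if not queue and hi - lo > chunk:
                # No pending work for others: hand off the upper half
                # of the remaining range and keep the lower half.
                mid = (lo + hi) // 2
                queue.append((mid, hi))
                hi = mid
                splits += 1
            else:
                # Pending work exists (or the range is small): process
                # up to `chunk` elements sequentially before rechecking.
                end = min(lo + chunk, hi)
                for i in range(lo, end):
                    total += f(xs[i])
                lo = end
    return total, splits
```

Calling `lazy_split_sum(list(range(16)), lambda x: x * x)` returns the same total as a plain sequential sum. The number of splits depends on how often the queue runs empty rather than on a fixed chunk count, which is the property that avoids per-program tuning.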

This paper generalises the flattening transformation, a technique for the efficient implementation of nested data parallelism, and reconciles it with mainstream functional programming. Nested data parallelism is significantly more expressive and convenient to use than the flat data parallelism typically used in conventional parallel languages like High Performance Fortran and C*. The flattening transformation of Blelloch and Sabot is a key technique for the efficient implementation of nested parallelism via flat parallelism, but originally it was severely restricted, as it did not permit general sum types, recursive types, higher-order functions, or separate compilation. Subsequent work, including some of our own, generalised the transformation and allowed higher-order functions and recursive types. In this paper, we take the final step of generalising flattening to cover the full range of types available in modern languages like Haskell and ML; furthermore, we enable the use of separate compilation. In addition, we present a completely new formulation of the transformation, which is based on standard lambda calculus notation and replaces a previously ad hoc transformation step with a systematic generic programming technique. First experiments demonstrate the efficiency of our approach.
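
The core idea behind flattening, representing a nested array as flat data plus a segment descriptor so that nested operations become flat ones, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the transformation formulated in the paper: it handles only one level of nesting over numbers, and the names `flatten` and `segmented_sum` are hypothetical.

```python
from itertools import accumulate


def flatten(nested):
    """Split a list of lists into flat data and per-segment lengths."""
    data = [x for seg in nested for x in seg]
    lengths = [len(seg) for seg in nested]
    return data, lengths


def segmented_sum(data, lengths):
    """Sum each segment using only flat operations on `data`.

    A prefix sum over the flat data is computed once; each segment sum
    is then the difference of two prefix values.
    """
    prefix = [0] + list(accumulate(data))
    ends = list(accumulate(lengths))
    starts = [e - n for e, n in zip(ends, lengths)]
    return [prefix[e] - prefix[s] for s, e in zip(starts, ends)]


data, lengths = flatten([[1, 2, 3], [], [4, 5]])
print(segmented_sum(data, lengths))  # prints [6, 0, 9]
```

The inner sums over segments of irregular length are expressed as one prefix sum and one elementwise subtraction over flat arrays, which are the kind of operations a flat data-parallel backend can execute directly.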

Articles published on Nested Data Parallelism

A language for hierarchical data parallel design-space exploration on GPUs

Nested data-parallelism on the GPU

Work efficient higher-order vectorisation

Lazy tree splitting

More types for nested data parallel programming

Irregular Computations in Fortran – Expression and Implementation Strategies

A new model for integrated nested task and data parallel programming

A Partitioning-Independent Paradigm for Nested Data Parallelism
