Automatic Hierarchical Parallelization of Linear Recurrences

Sepideh Maleki, Martin Burtscher

doi:10.1145/3296957.3173168

Abstract

Linear recurrences encompass many fundamental computations including prefix sums and digital filters. Later result values depend on earlier result values in recurrences, making it a challenge to compute them in parallel. We present a new work- and space-efficient algorithm to compute linear recurrences that is amenable to automatic parallelization and suitable for hierarchical massively-parallel architectures such as GPUs. We implemented our approach in a domain-specific code generator that emits optimized CUDA code. Our evaluation shows that, for standard prefix sums and single-stage IIR filters, the generated code reaches the throughput of memory copy for large inputs, which cannot be surpassed. On higher-order prefix sums, it performs nearly as well as the fastest handwritten code from the literature. On tuple-based prefix sums and digital filters, our automatically parallelized code outperforms the fastest prior implementations.
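To make the dependence problem concrete, here is a minimal sketch of a first-order linear recurrence, y[i] = x[i] + a * y[i-1], computed once serially and once in independent chunks with a carry fix-up. This is a generic illustration of why linearity permits chunked evaluation; it is not the algorithm or the code generator described in the abstract. The function names, the coefficient `a`, and the `chunk` parameter are assumptions made for the example. With a = 1 the recurrence is a standard prefix sum; other values of `a` give a single-stage IIR filter.

```python
def recurrence_serial(x, a):
    """Reference version: y[i] = x[i] + a * y[i-1], with y[-1] taken as 0."""
    y = []
    prev = 0
    for v in x:
        prev = v + a * prev
        y.append(prev)
    return y


def recurrence_chunked(x, a, chunk=4):
    """Same recurrence, solved per chunk and then corrected with a carry.

    Hypothetical helper for illustration only; assumes chunk >= 1.
    """
    # Phase 1: solve every chunk on its own, assuming a zero carry-in.
    # These calls do not depend on each other, so they could run in parallel.
    local_results = [
        recurrence_serial(x[start:start + chunk], a)
        for start in range(0, len(x), chunk)
    ]

    # Phase 2: fix up each chunk with the last output of the previous chunk.
    # By linearity, the true value at offset k within a chunk is
    # local[k] + a**(k + 1) * carry.
    out = []
    carry = 0
    for local in local_results:
        factor = a
        fixed = []
        for v in local:
            fixed.append(v + factor * carry)
            factor *= a
        out.extend(fixed)
        carry = fixed[-1]
    return out
```

In this sketch the carry is still passed from chunk to chunk in order; only the per-chunk work in phase 1 is independent. Calling `recurrence_chunked(list(range(1, 11)), 1)` yields the ordinary prefix sums of 1 through 10.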

Journal: ACM SIGPLAN Notices
Publication Date: Mar 19, 2018
Citations: 2

Similar Papers

Formal verification of parallel prefix sum and stream compaction algorithms in CUDA
Mohsen Safari ... Marieke Huisman
Theoretical Computer Science | VOL. 912
02 Mar 2022

Chapter 11 - Prefix sum (scan): An introduction to work efficiency in parallel algorithms
Li-Wen Chang ... John Owens
Programming Massively Parallel Processors
24 Jun 2022

Some Results of Designing an IIR Smoothing Filter with P-Splines
Elena A Kochegurova ... Svetlana V Rybushkina
International Review of Automatic Control (IREACO) | VOL. 12
31 Jul 2019
