On improving the performance of data partitioning oriented parallel irregular reductions

E. Gutiérrez, E. L. Zapata, O. Plata

doi:10.1109/empdp.2002.994330

Abstract

This paper classifies parallelization techniques for reductions into two classes: loop partitioning-oriented (LPO) techniques and data partitioning-oriented (DPO) techniques. We analyze both classes in terms of four performance properties: data locality, memory overhead, parallelism, and workload balancing. We then propose several techniques to increase the exploited parallelism and to introduce load balancing into a DPO method. For parallelism, the solution is based on partial expansion of the reduction array. For load balancing, we propose two techniques. The first is generic, as it can deal with any kind of load imbalance present in the problem domain. The second handles a special case of load imbalance that appears when a large number of write operations fall on small regions of the reduction arrays. Efficient implementations of the proposed optimizations for the DWA-LIP (data write affinity-loop index prefetching) DPO method are presented, experimentally tested on static and dynamic kernel codes, and compared with other parallel reduction methods.
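
The LPO/DPO distinction can be illustrated with a small sketch. This is not the authors' DWA-LIP implementation. It is a minimal, sequentially simulated example under stated assumptions: the function names, the block-wise ownership of the reduction array, and the "thread" loops are all hypothetical and chosen only for illustration. The LPO variant partitions the loop iterations and gives each thread a private full copy of the reduction array (higher memory overhead). The DPO variant partitions the reduction array and assigns each iteration to the owner of the element it writes (better locality, but the per-thread load depends on the write pattern).

```python
def sequential_reduction(n_elems, idx, vals):
    """Reference irregular reduction: a[idx[k]] += vals[k] for every iteration k."""
    a = [0.0] * n_elems
    for j, v in zip(idx, vals):
        a[j] += v
    return a


def lpo_reduction(n_elems, idx, vals, n_threads):
    """Loop partitioning (illustrative): iterations are split into contiguous
    chunks and each simulated thread accumulates into a private full copy."""
    n_iters = len(idx)
    chunk = (n_iters + n_threads - 1) // n_threads
    private = [[0.0] * n_elems for _ in range(n_threads)]
    for t in range(n_threads):
        for k in range(t * chunk, min((t + 1) * chunk, n_iters)):
            private[t][idx[k]] += vals[k]
    # Final combination step over all private copies.
    return [sum(private[t][j] for t in range(n_threads)) for j in range(n_elems)]


def dpo_reduction(n_elems, idx, vals, n_threads):
    """Data partitioning (illustrative): the reduction array is split into
    contiguous blocks, and each iteration goes to the block owner of the
    element it writes. Returns the result and the per-thread iteration counts."""
    block = (n_elems + n_threads - 1) // n_threads
    # Inspector-like pass (an assumption of this sketch): bucket iterations by owner.
    work = [[] for _ in range(n_threads)]
    for k, j in enumerate(idx):
        work[j // block].append(k)
    a = [0.0] * n_elems
    for t in range(n_threads):
        for k in work[t]:
            a[idx[k]] += vals[k]  # each thread writes only inside its own block
    return a, [len(w) for w in work]


# Hypothetical input: most writes hit element 0, a small region of the array.
idx = [0, 0, 0, 0, 5, 7, 2, 0]
vals = [1.0] * len(idx)
result, loads = dpo_reduction(8, idx, vals, n_threads=4)
```

In this hypothetical input, five of the eight iterations write to the block owned by the first thread, so the DPO assignment is unbalanced even though no private copies are needed. This is the kind of situation the load-balancing techniques in the abstract are meant to address.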

Similar Papers

Balanced, Locality-Based Parallel Irregular Reductions
Eladio Gutiérrez ... Emilio L Zapata
01 Jan 2003

Data partitioning‐based parallel irregular reductions
Eladio Gutiérrez ... Oscar Plata
Concurrency and Computation: Practice and Experience | VOL. 16
07 Jan 2004

Evaluation of graph representations with active nodes
Masayuki Numao ... Masamichi Shimura
01 Jan 1986

Heuristic approach to allocation of trucks in a transportation system
Takeo Takeno ... Kenichiro Matsui
Computers & Industrial Engineering | VOL. 27
01 Sep 1994
