Speeding up irregular applications in shared-memory multiprocessors

Zheng Zhang, Josep Torrellas

doi:10.1145/223982.224423

Abstract

While many parallel applications exhibit good spatial locality, other important codes in areas like graph problem-solving or CAD do not. Often, these irregular codes contain small records accessed via pointers. Consequently, while the former applications benefit from long cache lines, the latter prefer short lines. One good solution is to combine short lines with prefetching. In this way, each application can exploit the amount of spatial locality that it has. However, prefetching, if provided, should also work for the irregular codes. This paper presents a new prefetching scheme that, while usable by regular applications, is specifically targeted to irregular ones: Memory Binding and Group Prefetching. The idea is to hardware-bind and prefetch together groups of data that the programmer suggests are strongly related to each other. Examples are the different fields in a record or two records linked by a permanent pointer. This prefetching scheme, combined with short cache lines, results in a memory hierarchy design that can be exploited by both regular and irregular applications. Overall, it is better to use a system with short lines (16-32 bytes) and our prefetching than a system with long lines (128 bytes) with or without our prefetching. The former system runs 6 out of 7 Splash-class applications faster. In particular, some of the most irregular applications run 25-40% faster.
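The binding idea can be illustrated with a toy model. The sketch below is not the paper's hardware mechanism; it is a minimal simulation under stated assumptions: an unbounded cache of 16-byte lines, a hypothetical `bind` call standing in for the programmer's hint, and a policy that fetches every line of a bound group on a miss to any one of them. The names `ToyCache`, `bind`, `access`, and `traverse` and the record addresses are invented for illustration. It counts demand misses only and ignores latency, bandwidth, and coherence.

```python
LINE_SIZE = 16  # bytes; a "short" line, within the abstract's 16-32 byte range


def line_of(addr):
    """Map a byte address to its cache-line number."""
    return addr // LINE_SIZE


class ToyCache:
    """Unbounded set of cached lines with optional group bindings (hypothetical model)."""

    def __init__(self):
        self.lines = set()    # line numbers currently cached
        self.group_of = {}    # line number -> frozenset of lines bound with it
        self.misses = 0       # demand misses observed so far

    def bind(self, addrs):
        """Record that the lines holding these addresses are strongly related."""
        group = frozenset(line_of(a) for a in addrs)
        for ln in group:
            self.group_of[ln] = group

    def access(self, addr):
        """Return True on a hit; on a miss, fetch the whole bound group."""
        ln = line_of(addr)
        if ln in self.lines:
            return True
        self.misses += 1
        self.lines |= self.group_of.get(ln, frozenset({ln}))
        return False


def traverse(cache, records, field_offsets=(0, 16)):
    """Touch two fields of each record; the fields sit on different lines."""
    for base in records:
        for off in field_offsets:
            cache.access(base + off)
    return cache.misses


# Assumed layout: four small records scattered in memory, each spanning two lines.
records = [0x1000, 0x5040, 0x2380, 0x90C0]

plain = ToyCache()
plain_misses = traverse(plain, records)

bound = ToyCache()
for base in records:
    bound.bind([base, base + 16])  # hint: the two fields of a record belong together
bound_misses = traverse(bound, records)

print(plain_misses, bound_misses)
```

In this model the unbound traversal misses on every line it touches, while the bound traversal misses once per record because the second field arrives with the first. A long cache line would achieve the same effect only when the related data happen to be contiguous, which is the case the abstract says irregular codes often lack.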

Similar Papers

Speeding up irregular applications in shared-memory multiprocessors
Zheng Zhang ... Josep Torrellas
ACM SIGARCH Computer Architecture News | VOL. 23
01 May 1995

The induction of abortion and the priming of the cervix with prostaglandin F2 alpha and prostaglandin E2 by intra-amniotic, extra-amniotic and intra-cervical application (author's transl)
W Schmidt ... F Kubli
Geburtshilfe und Frauenheilkunde | VOL. 42
01 Feb 1982

Exploring manycore multinode systems for irregular applications with FPGA prototyping
Marco Ceriani ... Gianluca Palermo
01 Aug 2013

Effectiveness of simple memory models for performance prediction
I Chihaia ... T Gross
10 Mar 2004
