Abstract

This paper presents a compilation technique that performs the automatic parallelization of canonical loops. Canonical loops are a recurring pattern that we have observed in many well known algorithms, such as frequent itemset, K-means and K nearest neighbors. Our compiler translates C code to sequences of stream filters that communicate through a variety of channel types. We analyze code containing canonical loops, separate the data over a cluster of processors and determine suitable communication strategies between these processors. Experiments performed on a cluster of 36 computers show that, for the three algorithms described above, our method produces speed-ups that are almost linear on the number of available processors. These experiments also show that the code automatically generated is competitive when compared to hand tuned programs.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call