Abstract

The efficiency of matrix applications in parallel processing environments depends on two factors: the speed of primitive matrix operations and the layout of distributed arrays. A good array layout improves locality and reduces communication overhead. Array alignment is especially important, as it is a minimum requirement for locality. Existing matrix programming environments either require manual alignment, which compromises ease of use, or fall back on default settings, which sacrifices performance. Techniques for automatic alignment have been proposed, but they are not widely used, and their practical significance has not been sufficiently examined. We present an experimental evaluation of an alignment optimization technique implemented in a parallelizing compiler for Matlab scripts. We measured the performance of five applications on two parallel architectures. The significance of alignment optimization is demonstrated by a 43% average improvement in performance and a doubling of speed in some realistic cases. With this optimization, ordinary Matlab scripts ran at speeds similar to hand-coded PBLAS implementations.

