Abstract
Stencil computations constitute the kernel of many scientific applications. Tiling is often used to improve the performance of stencil codes by increasing data locality and parallelism. However, tiled stencil codes typically require shadow regions, whose management becomes a burden to programmers. In fact, the code required to manage these regions, and in particular their updates, is often much longer than the computational kernel of the stencil. As a result, shadow regions usually impact programmers' productivity negatively. In this paper, we describe overlapped tiling, a construct that supports shadow regions in a convenient, flexible and efficient manner in the context of the hierarchically tiled array (HTA) data type. The HTA is a class designed to express algorithms with a high degree of parallelism and/or locality as naturally as possible in terms of tiles. We discuss the syntax and implementation of overlapped HTAs as well as our experience in rewriting parallel and sequential codes using them. The results have been satisfactory in terms of both productivity and performance. For example, overlapped HTAs reduced the number of communication statements in non-trivial codes by 78% on average while speeding them up. We also examine different implementation options and compare overlapped HTAs with previous approaches. Copyright © 2008 John Wiley & Sons, Ltd.
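To make the shadow-region burden concrete, the following is a minimal sketch of a hand-tiled 1D three-point stencil in plain Python. It is not the HTA library or its API; every name here (`split_with_shadows`, `update_shadows`, `stencil_tiled`) is hypothetical, and the sketch assumes the array length is divisible by the number of tiles. Each tile carries one shadow cell per side, and the shadows must be refreshed from the neighbouring tiles before the kernel runs.

```python
def stencil_reference(a):
    """Untiled 3-point average; the two global boundary cells stay fixed."""
    out = a[:]
    for i in range(1, len(a) - 1):
        out[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return out


def split_with_shadows(a, num_tiles):
    """Split `a` into equal tiles, each padded with one shadow cell per side.

    Assumption: len(a) is divisible by num_tiles.
    """
    size = len(a) // num_tiles
    return [[0.0] + a[t * size:(t + 1) * size] + [0.0] for t in range(num_tiles)]


def update_shadows(tiles):
    """Copy each neighbour's edge value into this tile's shadow cells.

    This is the bookkeeping that hand-written tiled codes must repeat
    before every stencil sweep.
    """
    for t, tile in enumerate(tiles):
        if t > 0:
            tile[0] = tiles[t - 1][-2]   # last owned cell of the left neighbour
        if t < len(tiles) - 1:
            tile[-1] = tiles[t + 1][1]   # first owned cell of the right neighbour


def stencil_tiled(a, num_tiles):
    """Apply the same 3-point average tile by tile, using the shadow cells."""
    tiles = split_with_shadows(a, num_tiles)
    update_shadows(tiles)
    last = len(tiles) - 1
    result = []
    for t, tile in enumerate(tiles):
        new = tile[1:-1]                 # owned cells only, shadows excluded
        for i in range(1, len(tile) - 1):
            at_left_edge = t == 0 and i == 1
            at_right_edge = t == last and i == len(tile) - 2
            if not (at_left_edge or at_right_edge):
                new[i - 1] = (tile[i - 1] + tile[i] + tile[i + 1]) / 3.0
        result.extend(new)
    return result


data = [float(i * i) for i in range(12)]
print(stencil_tiled(data, 3) == stencil_reference(data))
```

Even in this small sketch, the splitting and shadow-update code is longer than the one-line kernel, which is the imbalance the abstract describes. In a distributed setting, `update_shadows` would additionally involve explicit communication between processes.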
More From: Concurrency and Computation: Practice and Experience