Abstract
An optimizing compiler for a data parallel programming language can significantly improve program performance on a massively parallel computing system by incorporating new strategies for allocating array elements to processors. We discuss techniques for automatic layout of arrays in a compiler targeted to SIMD architectures, such as the Connection Machine computer system. Our primary goal is to minimize the cost of moving data among processors. We also attempt to minimize memory usage. Improved array layout may allow more specialized communication operations with lower cost. We discuss the algorithms to effect such improvement and present some typical examples of code fragments that can be improved significantly with respect to memory consumption and by orders of magnitude with respect to execution time.
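As an illustration only, the effect of layout on data movement can be sketched with a toy cost model. The sketch assumes one array element per processor, a cyclic mapping, and the single statement A[i] = B[i + 1]; the names `owner` and `moved_elements` are hypothetical and are not taken from the compiler described here.

```python
def owner(index, offset, num_procs):
    """Processor holding element `index` of an array placed with a given offset
    (assumed model: one element per processor, cyclic mapping)."""
    return (index + offset) % num_procs


def moved_elements(n, num_procs, offset_a, offset_b):
    """Count the elements of B that must cross processors to evaluate
    A[i] = B[i + 1] for i in 0 .. n-2 under the given offsets."""
    return sum(
        1
        for i in range(n - 1)
        if owner(i, offset_a, num_procs) != owner(i + 1, offset_b, num_procs)
    )


n = num_procs = 8
# Both arrays start on processor 0: every operand lives on a neighbouring processor.
naive = moved_elements(n, num_procs, offset_a=0, offset_b=0)
# B is shifted by one position, so B[i + 1] sits on the processor that holds A[i].
aligned = moved_elements(n, num_procs, offset_a=0, offset_b=-1)
print(naive, aligned)
```

In this toy model the shifted layout removes all interprocessor movement for the statement, which is the kind of saving that a layout chosen by the compiler is meant to capture. Real cost models would also have to weigh the other statements that use the same arrays.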