Abstract
The EM-4 is a supercomputer that offers very fast interprocessor communication and support for multithreading. In this paper we demonstrate that the EM-4, together with an automatic parallelization technique referred to as Data-Distributed Execution (DDE), offers a computing environment in which large portions of scientific code can be executed in parallel without the programmer having to express any parallelism explicitly. DDE exploits iteration-level parallelism in loops operating over arrays. It performs data-dependency analysis and, based on the results, distributes arrays over the local memories of the processing elements (PEs). The code is then transformed to "follow" the data distribution: each loop is spawned on all PEs concurrently, but its bounds are modified so that each PE operates mostly on its local subrange of the data, thus reducing remote accesses to a minimum. The approach has been tested on the EM-4 by implementing several benchmark programs representative of common scientific applications. The experiments show that high speedup is achievable by automatic parallelization of conventional Fortran-like programs.
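The loop-bound transformation described above can be illustrated with a small simulation. This is a minimal sketch, not the DDE implementation: it assumes a simple block distribution of the arrays, and the names `local_range`, `owner`, and `run_stencil` are hypothetical helpers introduced only for illustration. The PEs are simulated by a sequential outer loop; on the actual machine they would run concurrently.

```python
# Illustrative sketch only: block distribution and all function names are
# assumptions made for this example, not part of DDE or the EM-4 software.

def local_range(pe, num_pes, n):
    """Half-open index range [lo, hi) owned by `pe` under block distribution."""
    block = (n + num_pes - 1) // num_pes  # ceiling division
    lo = min(pe * block, n)
    hi = min(lo + block, n)
    return lo, hi


def owner(i, num_pes, n):
    """PE whose local memory holds element i under the same distribution."""
    block = (n + num_pes - 1) // num_pes
    return i // block


def run_stencil(b, num_pes):
    """Simulate `for i in 1..n-2: a[i] = b[i-1] + b[i+1]` spawned on every PE.

    Returns the result array and the number of reads that touched an
    element owned by a different PE (remote accesses).
    """
    n = len(b)
    a = [0] * n
    remote = 0
    for pe in range(num_pes):  # simulated; real PEs would run concurrently
        lo, hi = local_range(pe, num_pes, n)
        # Clip the original loop bounds [1, n-1) to this PE's local subrange.
        for i in range(max(lo, 1), min(hi, n - 1)):
            for j in (i - 1, i + 1):
                if owner(j, num_pes, n) != pe:
                    remote += 1  # neighbor lives in another PE's memory
            a[i] = b[i - 1] + b[i + 1]
    return a, remote
```

For a 16-element array on 4 simulated PEs, `run_stencil(list(range(16)), 4)` produces the same result as the sequential loop and counts 6 remote reads out of 28, all of them at block boundaries. This is the sense in which each PE operates "mostly" on local data.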