Abstract

The success of parallel architectures has been limited by the lack of high-level parallel programming languages and useful programming models. The data-parallel model of programming has been demonstrated to be useful and natural on a wide variety of parallel architectures. This dissertation presents a set of formal techniques for compiling high-level languages based on data parallelism. These techniques have been developed in the context of the high-level language FP*. FP* is a data-parallel dialect of the functional language FP that supports nested collections of data objects, polymorphism, and higher-order functions. FP* is suitable for data-parallel programming because its basic data type is an aggregate of other primitive data types, and its primitive functions operate on aggregates as a whole. The compiler translates FP* programs into low-level programs with forall loops, where parallelism arises from the simultaneous execution of all loop iterations.

The compiler uses an inference system to determine the types of data objects and the sizes of arrays in a program. Structure inference makes possible compile-time optimizations that reduce synchronization and storage requirements at run-time on parallel machines. High-level languages organized around aggregates tend to suffer from the creation and copying of large intermediate data structures. On parallel architectures, copying a large data structure may require inter-processor communication, which can be extremely expensive compared to local computation. The FP* compiler substantially reduces the copying of large data structures, thereby reducing inter-processor communication on parallel computers. The compiler also optimizes data layout to improve the load balance of compiled programs.

These techniques have been devised in a formal framework. This dissertation presents a formal description of the compiler using a syntactic function that produces a low-level data-parallel program given an FP* function as input.
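To make the translation concrete, the following is a minimal illustrative sketch, in plain Python rather than FP* or CM Fortran, of the two levels the abstract describes: a whole-aggregate operation in the source style, and the forall-loop form the compiler targets. The function names and the example operation (elementwise addition) are invented for illustration and are not taken from the dissertation.

```python
def add_aggregates(xs, ys):
    # High-level, data-parallel view: a single operation applied to
    # whole aggregates, with no explicit indexing by the programmer.
    return [x + y for x, y in zip(xs, ys)]

def add_aggregates_forall(xs, ys):
    # Low-level view after compilation: a forall loop whose iterations
    # are mutually independent, so on a parallel machine all of them
    # could execute simultaneously.
    n = len(xs)
    out = [0] * n
    for i in range(n):  # conceptually "forall i in 0..n-1"
        out[i] = xs[i] + ys[i]
    return out
```

Because each iteration writes a distinct element of `out` and reads no element written by another iteration, the loop carries no dependences, which is what licenses the simultaneous execution the abstract refers to.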
The effects of the compiler optimizations are demonstrated through an implementation on the Connection Machine CM-2. Running times are presented for a set of benchmark programs written in FP* and compiled into CM Fortran using the compiler reported in this dissertation. These timings demonstrate the performance improvements obtained through the compiler optimizations formally specified in this thesis.