Optimized CM Fortran compiler for the Connection Machine computer

G Sabot

doi:10.1109/hicss.1992.183289

Abstract

This paper describes the techniques used in the CM Fortran 1.0 compiler to map the fine-grained array parallelism of Fortran 90 onto the CM-2 architecture. The compiler views the parallel hardware at a much lower level of detail than previous CM-2 compilers did; those compilers targeted a function library named Paris. In the slicewise machine model used by CM Fortran 1.0, the FPUs, their registers, and the memory hierarchy are directly exposed to the compiler. The CM-2 target machine is therefore not treated as 64K simple bit-serial processors. Rather, the target is a machine containing 2K PEs (processing elements), where each PE is both superpipelined and superscalar. The compiler uses data distribution to spread the problem across the 2K processors. A new compiler phase separates the code that runs on the two types of processors in the CM: the parallel PEs, which execute a new RISC-like vector instruction set called PEAC, and the scalar front-end processor, which executes SPARC or VAX assembler code. The pipelines in the PEs are filled by applying conventional vector processing techniques to PEAC code. An innovative scheduler overlaps the execution of a number of RISC operations. The new compiler has greatly increased the performance of Fortran codes on the CM-2 for many important computation kernels from fields such as climate modeling, seismic processing, and hydrodynamics simulation.
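
As a rough illustration of the data distribution and vector processing ideas mentioned in the abstract, the sketch below splits an array into contiguous blocks, one per PE, and then walks each block in short fixed-length strips, the way a vector instruction would consume it. This is not the compiler's actual algorithm: the block layout, the function names (block_distribute, strip_mine, elementwise_add), and the vector length of 4 are all assumptions made for the example. Only the figure of 2K PEs comes from the abstract.

```python
NUM_PES = 2048  # "2K PEs" as stated in the abstract


def block_distribute(n, num_pes=NUM_PES):
    """Assign n array elements to num_pes PEs in contiguous blocks.

    Hypothetical layout: returns one (start, stop) half-open range per PE.
    The first (n % num_pes) PEs receive one extra element.
    """
    base, extra = divmod(n, num_pes)
    bounds = []
    start = 0
    for pe in range(num_pes):
        size = base + (1 if pe < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def strip_mine(start, stop, vector_length=4):
    """Cut a PE's local range into strips of at most vector_length elements.

    vector_length=4 is an assumed value for illustration only.
    """
    strips = []
    for lo in range(start, stop, vector_length):
        strips.append((lo, min(lo + vector_length, stop)))
    return strips


def elementwise_add(a, b, num_pes=NUM_PES, vector_length=4):
    """Simulate C = A + B: each PE processes its block strip by strip.

    The PEs run sequentially here; on real hardware they would run in parallel.
    """
    n = len(a)
    c = [0.0] * n
    for start, stop in block_distribute(n, num_pes):
        for lo, hi in strip_mine(start, stop, vector_length):
            # One strip stands in for one vector operation on a PE.
            for i in range(lo, hi):
                c[i] = a[i] + b[i]
    return c
```

For example, calling block_distribute(10, 4) hands the ten elements to four PEs in blocks of sizes 3, 3, 2, and 2, and each block is then consumed in strips no longer than the assumed vector length.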
