Abstract

The parallel optimizing compiler is offered as the only viable means of fully exploiting the power of parallel architectures and applying it to mainstream computing problems. In this context, "mainstream" includes, but should not be limited to, scientific computing, which generally implies the solution of partial differential equations, fast Fourier transforms, matrix manipulations, and the like. Our motivation, in short, is the fast execution of a broad class of programs.

The requirements of general parallel computing are examined from an engineering perspective, and architectures are evaluated against accepted standards: cost, reliability/fault tolerance, and performance. To simplify the evaluation, the standards are applied to classes rather than to individual architectures. Two classes of architectures exist: algorithmically and computationally specialized architectures (e.g., systolic arrays, SIMD machines, associative processors) and computationally generalized architectures (e.g., MIMD homogeneous multiprocessors with shared central memory). As each standard is applied, generalized architectures invariably prove superior. Although some algorithmically specialized computers (e.g., systolic arrays) are admittedly less expensive than their more generalized counterparts (though this is not the case with most SIMD machines), total system cost rapidly becomes prohibitive when one considers the vast number of algorithms encountered in solving mainstream computing problems. Reliability also favors the generalized machine: a machine rich in data paths and redundancy clearly offers superior fault tolerance. Concerning performance, while algorithmically specialized architectures can occasionally provide nearly optimal solutions to certain specific problems, generalized architectures can deliver both high throughput and rapid turnaround. Indeed, a generalized machine might make use of one or more algorithmically specialized architectures, much as a standard system maintains a library of specialized software routines. In sum, specialized architectural approaches are fundamentally flawed and insufficient for mainstream computing problems.

Having opted for generalized parallel architectures, two approaches to their exploitation are evaluated: parallel languages and parallel optimizing compilers. The languages considered include both those designed expressly for parallel programming, such as Occam, and sequential languages extended with parallel constructs. Standards similar to those employed above, but adapted to software, are used to evaluate each approach: cost and performance again, with maintainability and portability substituted for reliability. Examined in this light, the parallel language approach proves inferior in nearly every case. First, software science has shown empirically that increasing the number of instructions required to perform a given task directly degrades both programmer productivity and software quality. The additional constructs required to express an algorithm in a parallel language therefore raise the overall cost of the program; and although hand-coded programs are often more efficient than their compiled counterparts, industry studies have shown that the performance gain attributable to hand coding is less than twenty percent.
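To make the cost argument concrete, consider the minimal sketch below. It is illustrative only and not drawn from the paper: POSIX threads stand in for any explicit parallel notation, and the array size and thread count are arbitrary. Every line added to the parallel version is partitioning and synchronization bookkeeping that, under this argument, a parallelizing compiler could generate from the sequential loop instead.

```c
#include <pthread.h>

#define N        1000000
#define NTHREADS 4

static double a[N];
static double partial[NTHREADS];

/* Sequential version: the algorithm and nothing else. */
double sum_sequential(void) {
    double s = 0.0;
    for (int i = 0; i < N; i++)
        s += a[i];
    return s;
}

/* Explicitly parallel version: the same algorithm, now entangled
 * with hand-written partitioning, thread creation, and joining. */
static void *worker(void *arg) {
    long t  = (long)arg;
    long lo = t * (N / NTHREADS);
    long hi = (t == NTHREADS - 1) ? N : lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++)
        s += a[i];
    partial[t] = s;
    return NULL;
}

double sum_parallel(void) {
    pthread_t tid[NTHREADS];
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, worker, (void *)t);
    double s = 0.0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        s += partial[t];   /* combine per-thread results */
    }
    return s;
}
```

In the sequential form, the partitioning decision remains the compiler's to make, and to remake as machines change; in the parallel form it is frozen into the source, which bears directly on the maintenance and portability problems discussed next.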
Programs coded in a parallel language are also more difficult to maintain. A program might, for example, be modified in such a way that the original process partitioning is no longer optimal; worse still, it may no longer function at all. This problem does not arise with sequential programming languages. Concerning portability, if a source program written in a parallel language is moved to a machine with a different architecture (i.e., one with dissimilar process creation, synchronization, and communication costs), it will be less efficient and, more likely than not, incorrect. The criticisms leveled against parallel languages are evidently much like those favoring the use of high-order languages over assembly languages. This points to an underlying philosophical tenet: humans should be freed, wherever possible, from concern over details that are best left to automation. Finally, and most importantly, recent research in psycholinguistics has demonstrated that the structure of natural language is fundamentally sequential; the human brain inherently maps even parallel ideas onto a sequential medium. The parallel language approach is thus inimical to human cognitive processes, and its suitability for widespread use in parallel computing is extremely limited.

It is clear, then, that our goal should be the development of powerful, generalized parallel architectures and the design of compilers capable of extracting the maximum parallelism inherent in a program.
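A compiler pursues that goal through dependence analysis of ordinary sequential code. The fragment below is a sketch of our own (the function and array names are hypothetical, not from the paper) showing the two loop forms such a compiler must distinguish: one whose iterations are independent and can be distributed across processors automatically, and one carrying a dependence from each iteration to the next.

```c
/* Illustrative sketch: dependence analysis in a parallelizing compiler. */
void example(int n, const double *a, const double *b, double *c,
             double *x, const double *y) {
    /* No loop-carried dependence: every iteration is independent,
     * so the compiler may execute them on separate processors
     * without any change to the source. */
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];

    /* Iteration i reads x[i-1], written by iteration i-1: a true
     * (flow) dependence.  The compiler must preserve sequential
     * order here, or restructure the recurrence (e.g., as a
     * parallel prefix computation). */
    for (int i = 1; i < n; i++)
        x[i] = x[i - 1] + y[i];
}
```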
