Abstract

Despite decades of activity, parallel computing remains immature. Like much of computer science, advances in the field are driven by a mixture of theoretical insights and technological advances. But in parallel computing, the gap between theory and practice remains disconcertingly wide.

Key theoretical concepts in parallel computing were developed in the seventies, including P-completeness, PRAMs, Boolean circuits, and more. A rich set of algorithms and complexity classes was built on top of these models, and these ideas influenced early thinking about parallel machines. But applied parallel computing today is little influenced by these concepts.

In the past twenty years, computational scientists and engineers have embraced parallelism as an essential computational tool for solving ever larger and more complex problems. By the mid-nineties, the computational science community had settled on a computing paradigm involving distributed memory machines programmed with explicit message passing. Although not elegant, this approach allows for portability and the use of inexpensive commodity hardware. This hardware is sufficiently different from the abstract models proposed in the seventies that insights from the theory community have had minimal impact on mainstream parallel computing practice.

A variety of alternative theoretical models have been proposed in the interim that are more reflective of hardware realities than their counterparts from the seventies. These include the Bulk Synchronous Parallel (BSP) and LogP complexity models, as well as computational abstractions like Active Messages and partitioned global address space languages. But these concepts and tools have been consigned to niche roles within the larger MPI-dominated parallel computing world.

However, several forces are rearranging the parallel computing landscape and will raise the importance of ideas and techniques from the theory community. The most familiar change is the rise of multicore processors. The need for node-based parallelism will broaden the parallel computing community, and will also require significant modifications to existing parallel computing practice. But multicore processors are only an artifact of a more fundamental change that Moore's Law is bringing. Silicon is now essentially free, and this will allow for architectural innovation and potentially disruptive new processor designs. The decade of homogeneity in parallel architectures is coming to an end, and the implications for the parallel computing community could be profound.

Simultaneous with this architectural revolution is an upheaval in the demands of applications. The computational physics and engineering applications that have dominated parallel computing are comparatively easy to parallelize: they have abundant coarse-grained parallelism that can be expressed in a bulk synchronous style, and they exhibit a high degree of spatial and temporal locality. But unstructured and adaptive grids and the complexity of multiphysics simulations are already greatly taxing existing machines and programming models. Some newly emerging applications look to be incompatible with current mainstream approaches to parallel computing. As just one example, the analysis of very large social or communication networks has become important in the social sciences, but it is entirely unclear how to perform some of these computations using existing approaches. These new, more data-centric and highly unstructured applications will require new ways of expressing and exploiting parallelism.

As with the end of the Cretaceous period, these disruptive externalities will reshape the parallel computing ecosystem. New architectures and applications will require fresh thinking from both the applied and theoretical communities. The enormous challenges ahead will provide great opportunities for closer ties between theory and practice, to the betterment of all.
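To make the message-passing paradigm described above concrete, the following is a minimal sketch, not drawn from the paper, of one bulk synchronous step in MPI C: each process computes on its own data, exchanges a value with a neighbor via explicit sends and receives, and then synchronizes before the next phase. The ring topology and the single double value per rank are illustrative assumptions only.

    /* Minimal sketch of the explicit message-passing, bulk synchronous style.
     * Assumes an MPI installation and a C compiler wrapper such as mpicc. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double local = (double)rank;   /* stand-in for each rank's local computation */
        double from_left = 0.0;
        int left  = (rank - 1 + size) % size;   /* periodic ring neighbors (illustrative) */
        int right = (rank + 1) % size;

        /* Explicit message passing: send to the right neighbor, receive from the left. */
        MPI_Sendrecv(&local, 1, MPI_DOUBLE, right, 0,
                     &from_left, 1, MPI_DOUBLE, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Bulk synchronous step boundary: all ranks wait before the next phase. */
        MPI_Barrier(MPI_COMM_WORLD);

        printf("rank %d received %.1f from rank %d\n", rank, from_left, left);
        MPI_Finalize();
        return 0;
    }

Such a program would typically be launched with, e.g., mpirun -np 4 ./a.out, one process per distributed-memory node.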
