Abstract

Current parallelizing compilers can tackle applications that exercise regular access patterns on arrays or affine indices, where data dependencies can be expressed in a linear form. Unfortunately, there are cases in which independence between code statements cannot be guaranteed, and the compiler must therefore conservatively produce sequential code. Programs that involve extensive pointer use, irregular access patterns, or loops with an unknown number of iterations are examples of such cases. This limits the extraction of parallelism in code where dependencies are rarely or never triggered at runtime. Speculative parallelization refers to methods, employed during program execution, that aim to produce a valid parallel execution schedule for programs that resist static parallelization. The motivation for this article is to review recent developments in compiler-driven software speculation for thread-level parallelism and how they came about. The article is divided into two parts. The first part explains the fundamentals of speculative parallelization for thread-level parallelism, together with a categorization of the design choices involved in implementing such systems: how speculative data is handled, how data dependence violations are detected and resolved, how the correct data are made visible to other threads, and how speculative threads are scheduled. The second part is structured around those design choices, presenting advances and trends in the literature with reference to key developments in the area. Although the focus of the article is on software speculative parallelization, a section is dedicated to providing the interested reader with pointers and references for exploring related topics such as hardware thread-level speculation, transactional memory, and automatic parallelization.
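
To make the problem concrete, the following is a minimal C sketch (ours, not drawn from the article) of a loop that static analysis must keep sequential: whether two iterations conflict depends on the runtime contents of idx, so the compiler cannot prove independence even when conflicts are rare or absent in practice.

    /* A loop that defeats static dependence analysis: whether iteration i
       conflicts with iteration j depends on the runtime contents of idx[],
       so the compiler must conservatively emit sequential code, even if
       idx[] rarely (or never) contains duplicate values. */
    void scale(double *a, const int *idx, int n) {
        for (int i = 0; i < n; i++) {
            a[idx[i]] = 2.0 * a[idx[i]];   /* may or may not alias across i */
        }
    }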

Highlights

  • In recent years, multicore chips have become the standard configuration in commercial computing

  • Loops with an unknown number of iterations are hard to parallelize, since there is no information on how to schedule the loop's iterations across threads (see the sketch after this list)

  • Static compiler approaches to automatic parallelization can be successful, but fail to parallelize code for which sufficient information is not available until runtime
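
As an illustration of the second highlight, here is a minimal C sketch (ours, not taken from the article) of a pointer-chasing loop: the trip count is only discovered as the list is traversed, so there is no basis for statically dividing the iterations among threads.

    /* Pointer-chasing loop: the trip count is unknown at compile time, and
       the data for the next iteration is produced only by the current one,
       so a static schedule cannot be computed. */
    typedef struct node { struct node *next; int val; } node;

    int sum_list(const node *p) {
        int s = 0;
        while (p != NULL) {   /* number of iterations unknown until runtime */
            s += p->val;
            p = p->next;      /* next iteration's location found only here */
        }
        return s;
    }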

Summary

Multicore chips have become the standard configuration in commercial computing. When subscripted subscripts (indirect accesses such as a[idx[i]]) are used to access array elements, the actual memory locations may not be known until runtime. The work discussed in the rest of the article concerns software implementations of speculative parallelization systems unless otherwise stated. A significant portion of Part I explores the dimensions along which a speculative parallelization system can be implemented: metadata, version management, conflict detection, commit/rollback, and the scheduling of speculative threads.
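
These dimensions can be made concrete with a small sketch. The following is a hypothetical, single-threaded C illustration (all names are ours, and the synchronization a real runtime needs is omitted): per-location metadata records the most speculative reader, writes go to a private buffer (lazy version management), read-after-write violations are detected eagerly at write time, and each thread either commits its buffer in thread order or rolls back by discarding it.

    #include <stdbool.h>
    #include <string.h>

    #define N      1024          /* size of the speculatively shared array */
    #define MAX_T  8             /* number of speculative threads          */

    typedef struct {
        int max_reader;          /* metadata: most-speculative reader tid  */
    } meta_t;

    static double shared[N];     /* committed (safe) program state         */
    static meta_t meta[N];
    static bool   squashed[MAX_T]; /* per-thread violation flags           */

    typedef struct {
        int    tid;              /* position in the sequential order       */
        double buf[N];           /* version management: private buffer     */
        bool   written[N];
    } spec_t;

    void spec_init(void) {
        for (int i = 0; i < N; i++) meta[i].max_reader = -1; /* no reader */
        memset(squashed, 0, sizeof squashed);
    }

    /* Speculative read: record it in the metadata. For simplicity there
       is no forwarding from less-speculative threads' private buffers.   */
    double spec_read(spec_t *t, int i) {
        if (meta[i].max_reader < t->tid)
            meta[i].max_reader = t->tid;
        return t->written[i] ? t->buf[i] : shared[i];
    }

    /* Eager conflict detection: if a more-speculative thread has already
       read shared[i], it consumed a stale value and must be squashed.    */
    void spec_write(spec_t *t, int i, double v) {
        if (meta[i].max_reader > t->tid)
            squashed[meta[i].max_reader] = true;
        t->buf[i]     = v;       /* buffered (lazy) versioning             */
        t->written[i] = true;
    }

    /* Called in tid order: commit the buffer, or roll back by discarding
       it. Metadata reset between speculative sections is omitted.        */
    bool spec_finish(spec_t *t) {
        if (squashed[t->tid]) {
            memset(t->written, 0, sizeof t->written);  /* rollback        */
            squashed[t->tid] = false;
            return false;                              /* re-execute      */
        }
        for (int i = 0; i < N; i++)
            if (t->written[i])
                shared[i] = t->buf[i];                 /* make visible    */
        return true;
    }

Real systems vary along each of these dimensions: version management may instead be eager (writing in place and keeping an undo log), conflict detection may be deferred until commit time, and the scheduler may assign chunks of iterations to threads rather than fixed blocks.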
