This special issue focuses on all aspects of parallel programming on multicore and manycore architectures, including programming models and systems, applications, algorithms, performance analysis, and debugging and performance tools. It includes selected articles from the 2020 International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM 2020), which was co-located with the PPoPP 2020 conference in San Diego, California, February 22–26, 2020.

From high-end servers to mobile phones, multicore and manycore systems are steadily entering every aspect of the information technology landscape. While this advanced technology creates new opportunities for scientific discovery, it also poses new challenges. In particular, even though many program developers have been trained in parallel programming, most existing parallel programs remain prone to errors such as data races and deadlocks and suffer from suboptimal performance. To fully exploit the computing power of multicore and manycore hardware architectures, innovations in parallel programming models, together with the associated development and execution ecosystems, are urgently needed to enable the delivery of error-free, high-performance parallel programs.

The PMAM 2020 workshop provided a discussion forum for people interested in programming environments, models, tools, and applications specifically designed for parallel multicore and manycore hardware environments. The four outstanding contributions in this special issue were selected from eight workshop presentations. The accepted publications cover several emerging research directions, focusing in particular on programming models and runtime techniques for graph and streaming applications on manycore and heterogeneous computing systems, as summarized below.

In “A Self-Adjusting Task Granularity Mechanism for the Java Lifeline-Based Global Load Balancer Library,” Finnerty et al. present a tuning mechanism that dynamically adjusts the performance-sensitive task granularity in the load balancing scheme, adapted from X10, for Java parallel programs, and demonstrate improved performance for four applications on two manycore supercomputers.

In “A Performance Predictor for Implementation Selection of Parallelized Static and Temporal Graph Algorithms,” Rehman et al. analyze the performance of graph workloads under different algorithmic and hardware conditions. Leveraging analytical and neural network models, they propose an inter-implementation predictor that chooses the best-performing parallel implementation for static and temporal graph benchmarks and inputs when executing on a CPU or a GPU architecture.

In “Sharing Non-Cache-Coherent Memory with Bounded Incoherence,” Ren et al. introduce a memory consistency approach, called bounded incoherence, that enables cached access to shared data structures in non-cache-coherent memory. The proposed model ensures that updates to memory on one node become visible on all other nodes within a bounded amount of time. Test results show a 30% performance improvement for the PowerGraph graph processing framework over state-of-the-art distributed approaches.

In “Code Generation for Energy-Efficient Execution of Dynamic Streaming Task Graphs on Parallel and Heterogeneous Platforms,” Litzinger et al. enhance the programming model of streaming applications by introducing dynamic elements into the task graphs, so that the runtime system can remap tasks and adapt to changes in the incoming task structure at runtime. The authors provide a prototype implementation of the toolchain and demonstrate the low overhead and low energy consumption of the proposed remapping technique.

We hope that the readers of the PMAM 2020 special issue will find these articles insightful contributions to the field of parallel programming for multicore and manycore systems.
The guest editors would like to thank all the reviewers and the Wiley office staff of Concurrency and Computation: Practice and Experience, who worked hard to make this high-quality journal issue happen while facing the challenges of the COVID-19 pandemic.

Data sharing is not applicable to this article as no new data were created or analyzed in this study.