Special Issue on Applications for the Heterogeneous Computing Era

Jiayuan Meng, Toshio Endo

doi:10.1177/1094342014527657

Abstract

As we look forward to the exascale era, heterogeneous parallel machines with accelerators, such as GPUs, FPGAs, and upcoming on-chip accelerator cores, are expected to play a major role in architecting the largest systems in the world. While there is significant interest in accelerator-based architectures, much of this interest is an artifact of the hype associated with them. This special issue focuses on understanding the implications of accelerators for the architectures and programming environments of future systems. It seeks to ground accelerator research through studies of application kernels or whole applications on such systems, as well as tools and libraries that improve the performance or productivity of applications trying to use these systems.

For accelerator-based heterogeneous systems to truly be a successful high-performance computing (HPC) platform, it is important that we obtain a complete picture of HPC applications and learn the opportunities and challenges these architectures raise. We need to learn the characteristics of computational kernels and applications, and how different software stacks impact them, in order to guide future accelerator-based HPC system designs.

In this special issue, we present case studies on accelerating representative kernels and applications on emerging multicore and manycore systems, including Intel MIC (Many Integrated Core) and GPU architectures. We also demonstrate better designs in programming models to scale the performance of applications running on HPC systems. Finally, we investigate algorithm designs for large-scale systems and study the power–performance tradeoffs of various optimization techniques on heterogeneous platforms.

In “Using MIC to accelerate graph traversal”, Gao et al. describe a highly optimized breadth-first graph traversal algorithm designed for the MIC architecture. The algorithm utilizes both the MIC accelerator and the host CPU, and thus exploits the full capability of the heterogeneous system. Graph traversal is an important kernel for big data analysis, and we believe their optimized algorithm will help other researchers and practitioners in this area.

In “Comparison sorting on hybrid multicore architectures for fixed and variable length keys”, Banerjee et al. present a hybrid comparison-based sorting algorithm that utilizes an NVidia GPU and an Intel i7 CPU. The algorithm explores ways to divide and conquer the overall problem and achieves a 20% gain over the current best-known comparison sorting algorithm. They also use a look-ahead-based approach to sort strings and obtain around a 24% performance benefit over the current best-known solution. Sorting has been a topic of immense research value, and we think the advance in sorting efficiency may have tremendous impact on various types of applications.

In “Composing multiple StarPU applications over heterogeneous machines: a supervised approach”, Hugo et al. propose an extension of StarPU, a runtime system specifically designed for heterogeneous architectures, to allow multiple parallel codes to run concurrently with minimal interference. They introduce a hypervisor that automatically expands or shrinks scheduling contexts (e.g. resource allocation) using feedback from the runtime system. Their mechanism can improve the overall application runtime by 34%.

In “Evaluating the multi-core and many-core architectures through accelerating the 3D LWC stencil”, You et al. showcase how they accelerate the iterative stencil loops in wave propagation forward modeling, which is a widely used computational method in oil and gas exploration. They experiment with architectures including Intel Sandy Bridge, NVidia Fermi C2070, NVidia Kepler K20x, and the Intel Xeon Phi coprocessors. Numerous parallel strategies and optimization techniques are employed, and a cross-platform performance and power analysis is also conducted.

In “Analyzing power efficiency of optimization techniques and algorithm design methods for applications on heterogeneous platforms”, Ukidave et al. evaluate the power/performance efficiency of different optimizations


Published in: The International Journal of High Performance Computing Applications


