Heterogeneous tasks and conduits framework for rapid application portability and deployment

James Brock, Mark Niedre, Miriam Leeser

doi:10.1109/inpar.2012.6339588

Abstract

Emerging heterogeneous and homogeneous processing architectures demonstrate significant increases in throughput for scientific applications over traditional single-core processors. These architectures vary widely in their processing capabilities, memory hierarchies, and programming models. Within this rapidly growing and evolving design space, determining the system architecture best suited to an application, or deploying an application that is portable across a number of different platforms, is increasingly complex and error-prone. Quickly and easily designing portable, high-performance applications that function correctly across these widely varied systems has become paramount. These programming challenges call for new models and tools. One example is MIT Lincoln Laboratory's Parallel Vector Tile Optimizing Library (PVTOL), which simplifies the task of developing software in C++ for these complex systems. This work extends the Tasks and Conduits framework in PVTOL to support GPU architectures and other heterogeneous platforms supported by the NVIDIA CUDA and OpenCL programming models. This allows applications to be ported rapidly to a very wide range of architectures and clusters. Using this framework, porting an application from a single CPU core to a GPU requires changing only 5 source lines of code (SLOC), in addition to writing the CUDA or OpenCL kernel. Using GPU-PVTOL, we have achieved a 22x speedup in an application of Monte Carlo simulations of photon propagation through a biological medium, and a 60x speedup of a 3D cone beam computed tomography (CT) image reconstruction algorithm.
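
The idea of tasks connected by conduits can be illustrated with a minimal C++ sketch. This is not the PVTOL or GPU-PVTOL API: every name here (`Conduit`, `Backend`, `source_task`, `compute_task`, `sink_task`, `run_pipeline`) is hypothetical, the tasks run one after another rather than concurrently, and the "accelerator" path is a placeholder that reuses the CPU code where a real port would launch a CUDA or OpenCL kernel. It is only meant to show why a port can be confined to a few lines: the conduits and the surrounding tasks stay the same, and only the backend choice in the compute task changes.

```cpp
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical conduit: a FIFO buffer that carries data blocks between tasks.
// (Not the PVTOL class; a real conduit would also handle concurrency and
// host/device transfers.)
template <typename T>
class Conduit {
public:
    void write(T value) { buffer_.push_back(std::move(value)); }

    // Returns false when no data is left to read.
    bool read(T& out) {
        if (buffer_.empty()) return false;
        out = std::move(buffer_.front());
        buffer_.pop_front();
        return true;
    }

private:
    std::deque<T> buffer_;
};

using Block = std::vector<float>;

// Hypothetical backend selector: the only thing a port would change here.
enum class Backend { Cpu, Accelerator };

// Stand-in computation: doubles every element of a block.
Block scale_on_cpu(const Block& block) {
    Block result(block.size());
    for (std::size_t i = 0; i < block.size(); ++i) result[i] = 2.0f * block[i];
    return result;
}

// Placeholder (assumption): a real port would launch a CUDA or OpenCL kernel.
Block scale_on_accelerator(const Block& block) { return scale_on_cpu(block); }

// Source task: feeds input blocks into its outgoing conduit.
void source_task(const std::vector<Block>& blocks, Conduit<Block>& out) {
    for (const Block& b : blocks) out.write(b);
}

// Compute task: reads blocks, processes them on the chosen backend, writes results.
void compute_task(Backend backend, Conduit<Block>& in, Conduit<Block>& out) {
    Block b;
    while (in.read(b)) {
        out.write(backend == Backend::Cpu ? scale_on_cpu(b)
                                          : scale_on_accelerator(b));
    }
}

// Sink task: drains its incoming conduit into one flat result vector.
std::vector<float> sink_task(Conduit<Block>& in) {
    std::vector<float> result;
    Block b;
    while (in.read(b)) result.insert(result.end(), b.begin(), b.end());
    return result;
}

// Wires the three tasks together; the tasks run sequentially in this sketch.
std::vector<float> run_pipeline(const std::vector<Block>& blocks, Backend backend) {
    Conduit<Block> to_compute;
    Conduit<Block> to_sink;
    source_task(blocks, to_compute);
    compute_task(backend, to_compute, to_sink);
    return sink_task(to_sink);
}
```

In this sketch, switching a call from `run_pipeline(blocks, Backend::Cpu)` to `run_pipeline(blocks, Backend::Accelerator)` leaves the source task, the sink task, and both conduits untouched, which is the kind of locality the 5-SLOC figure above suggests.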
