TOD-Tree: Task-Overlapped Direct Send Tree Image Compositing for Hybrid MPI Parallelism and GPUs.

A V Pascal Grosset, Charles Hansen, Manasa Prasad, Cameron Christensen, Aaron Knoll

doi:10.1109/tvcg.2016.2542069

Open Access

https://doi.org/10.1109/tvcg.2016.2542069

Abstract

Modern supercomputers have thousands of nodes, each with CPUs and/or GPUs capable of several teraflops. However, the network connecting these nodes is relatively slow, on the order of gigabits per second. For time-critical workloads such as interactive visualization, the bottleneck is no longer computation but communication. In this paper, we present an image compositing algorithm that works on both CPU-only and GPU-accelerated supercomputers and focuses on communication avoidance and overlapping communication with computation at the expense of evenly balancing the workload. The algorithm has three stages: a parallel direct send stage, followed by a tree compositing stage and a gather stage. We compare our algorithm with radix-k and binary-swap from the IceT library in a hybrid OpenMP/MPI setting on the Stampede and Edison supercomputers, show strong scaling results and explain how we generally achieve better performance than these two algorithms. We developed a GPU-based image compositing algorithm where we use CUDA kernels for computation and GPU Direct RDMA for inter-node GPU communication. We tested the algorithm on the Piz Daint GPU-accelerated supercomputer and show that we achieve performance on par with CPUs. Last, we introduce a workflow in which both rendering and compositing are done on the GPU.
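The three stages named in the abstract can be illustrated with a small single-process sketch. It is only a simulation under stated assumptions, not the authors' implementation: the abstract does not give the group size, the region layout, or the tree shape, so `group_size`, the contiguous region split, and the pairwise neighbour tree below are hypothetical choices. Images are flat lists of premultiplied-alpha RGBA pixels, ranks are assumed to be sorted front to back, and the asynchronous overlap of communication with computation that the paper focuses on is not modelled.

```python
from functools import reduce


def over(front, back):
    """Blend two premultiplied-alpha RGBA pixel lists: front over back."""
    return [tuple(f[c] + (1.0 - f[3]) * b[c] for c in range(4))
            for f, b in zip(front, back)]


def split(image, parts):
    """Cut a flat pixel list into `parts` contiguous regions."""
    n = len(image)
    bounds = [n * i // parts for i in range(parts + 1)]
    return [image[bounds[i]:bounds[i + 1]] for i in range(parts)]


def composite_three_stage(images, group_size):
    """Simulate direct send, tree compositing, and gather.

    images[k] is the full-size rendering of rank k, ordered front to back.
    group_size is a hypothetical parameter: ranks per direct-send group.
    """
    if len(images) % group_size != 0:
        raise ValueError("this sketch assumes rank count is a multiple of group_size")

    # Stage 1 (direct send): inside each group, region r has one owner that
    # receives region r from every group member and blends them in depth order.
    partial = []  # partial[g][r] = region r composited across group g
    for start in range(0, len(images), group_size):
        pieces = [split(img, group_size) for img in images[start:start + group_size]]
        partial.append([reduce(over, [p[r] for p in pieces])
                        for r in range(group_size)])

    # Stage 2 (tree): merge neighbouring groups pairwise, region by region,
    # until a single set of fully composited regions remains.
    level = partial
    while len(level) > 1:
        merged = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                merged.append([over(a, b) for a, b in zip(level[i], level[i + 1])])
            else:
                merged.append(level[i])  # odd group out moves up unchanged
        level = merged

    # Stage 3 (gather): concatenate the regions into the final image.
    return [pixel for region in level[0] for pixel in region]
```

Because the over operator is associative, grouping the blends this way gives the same image as compositing all ranks sequentially, provided depth order is preserved within and across groups. In a real MPI setting each region owner would be a separate process and the sends would be nonblocking.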

Journal: IEEE Transactions on Visualization and Computer Graphics
Publication Date: Mar 14, 2016
Citations: 14
License type: publisher-specific-oa

Similar Papers

Torus–Connected Cycles: A Simple and Scalable Topology for Interconnection Networks
Antoine Bossard ... Keiichi Kaneko
International Journal of Applied Mathematics and Computer Science | VOL. 25
01 Dec 2015

A Web-oriented Framework for Graph Simplification and Interactive Visualization
Guoyong Mao ... Ning Zhang
Journal of Computers
12 Jan 2013

CFDComm: An Optimized Library for Scalable Point-to-Point Communication for General CFD Applications
Sina Haeri ... John S Shrimpton
01 Jun 2012

Visualization of state transition graphs
18 Nov 2015
