Latency‐aware adaptive micro‐batching techniques for streamed data compression on graphics processing units

Charles M Stein, Marco Danelutto, Massimo Torquati, Luiz G Fernandes, Dalvan Griebler, Dinei A Rockenbach, Gabriele Mencagli

doi:10.1002/cpe.5786

Abstract

Stream processing is a parallel paradigm used in many application domains. With the advance of graphics processing units (GPUs), their usage in stream processing applications has increased as well. Using GPU accelerators efficiently in streaming scenarios requires batching input elements into microbatches, whose computation is offloaded to the GPU by exploiting data parallelism within each batch. Since data elements arrive continuously at the input rate, the bigger the microbatch, the longer it takes to buffer it completely and to start processing it on the device. Unfortunately, stream processing applications often have strict latency requirements, so the microbatch size must be chosen carefully and adapted dynamically to the workload conditions as well as to the characteristics of the underlying device and network. In this work, we implement latency‐aware adaptive microbatching techniques and algorithms for streaming compression applications targeting GPUs. The evaluation is conducted using the Lempel‐Ziv‐Storer‐Szymanski compression application under different input workloads. As a general result, we observed that algorithms with elastic adaptation factors respond better to stable workloads, while algorithms with narrower targets respond better to highly unbalanced workloads.
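The trade-off described in the abstract can be illustrated with a minimal sketch. It is not one of the algorithms evaluated in the paper: the names `buffering_latency_ms` and `BatchSizeController`, the fixed 20% adaptation factor, and the 10% tolerance band are all assumptions made for illustration. The first function shows why latency grows with batch size under a steady arrival rate; the controller shrinks the microbatch when the measured latency exceeds the target band and grows it when the latency falls below the band.

```python
from dataclasses import dataclass


def buffering_latency_ms(batch_size: int, arrival_rate_per_s: float) -> float:
    """Time needed to fill one microbatch at a steady arrival rate."""
    return 1000.0 * batch_size / arrival_rate_per_s


@dataclass
class BatchSizeController:
    """Hypothetical controller; parameters are illustrative assumptions."""
    target_ms: float          # desired per-batch latency
    tolerance: float = 0.1    # accepted band: target +/- 10%
    factor: float = 0.2       # fixed adaptation step (20% of current size)
    min_size: int = 1
    max_size: int = 65536

    def next_size(self, current_size: int, measured_ms: float) -> int:
        low = self.target_ms * (1.0 - self.tolerance)
        high = self.target_ms * (1.0 + self.tolerance)
        if measured_ms > high:
            # Too slow: shrink by the factor, by at least one element.
            proposed = min(current_size - 1,
                           round(current_size * (1.0 - self.factor)))
        elif measured_ms < low:
            # Latency headroom: grow by the factor, by at least one element.
            proposed = max(current_size + 1,
                           round(current_size * (1.0 + self.factor)))
        else:
            # Inside the target band: keep the current size.
            proposed = current_size
        # Clamp to the allowed range.
        return max(self.min_size, min(self.max_size, proposed))


if __name__ == "__main__":
    ctrl = BatchSizeController(target_ms=50.0)
    size = 100
    for measured in (80.0, 70.0, 52.0, 20.0):
        size = ctrl.next_size(size, measured)
        print(f"measured={measured} ms -> next batch size {size}")
```

In this sketch, a wider tolerance band makes the controller react less often, while a larger factor makes each reaction stronger; how such choices behave under stable or unbalanced workloads is what the paper evaluates with its own algorithms.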


Journal: Concurrency and Computation: Practice and Experience
Publication Date: May 4, 2020
Citations: 11
License type: cc-by-nc-sa


