Concurrent query processing in a GPU-based database system.

Hao Li, Yi-Cheng Tu, Bo Zeng

doi:10.1371/journal.pone.0214720

Open Access

https://doi.org/10.1371/journal.pone.0214720

Journal: PloS one
Publication Date: Apr 16, 2019
Citations: 1
License type: CC BY 4.0

Affiliation: University of South Florida, University of Pittsburgh

Abstract

The computing capabilities of modern GPUs meet the demand for processing the massive amounts of data seen in many application domains. While traditional HPC systems support applications as standalone entities that occupy entire GPUs, GPU-based DBMSs are meant to run multiple tasks at the same time on the same device. System-level resource management mechanisms are therefore needed to fully exploit the computing power of GPUs in large-scale data processing, and several studies have addressed this need. In our previous work, we explored the modeling of a single compute-bound kernel on GPUs under NVIDIA’s CUDA framework and provided an in-depth anatomy of NVIDIA’s concurrent kernel execution mechanism (CUDA streams). This paper focuses on allocating resources among multiple GPU applications to optimize system throughput in the context of such systems. Compared with earlier studies that enable concurrent task support on GPUs, such as MultiQx-GPU, we take a different approach: we control the launch parameters of multiple GPU kernels, as provided by compile-time performance modeling, as a kernel-level optimization, and we add a more general pre-processing model with batch-level control to enhance performance. Specifically, we construct a variation of the multi-dimensional knapsack model to maximize concurrency in a multi-kernel environment. We present an in-depth analysis of our model and develop a dynamic programming algorithm to solve it. We prove that the algorithm finds optimal solutions (in terms of thread concurrency) and has pseudopolynomial complexity in both time and space. These results are verified by extensive experiments on our microbenchmark, which consists of real-world GPU queries. Furthermore, solutions identified by our method also significantly reduce the total running time of the workload compared with sequential and MultiQx-GPU executions.
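To make the knapsack idea concrete, the following is a minimal sketch of a dynamic program over two resource dimensions. It is not the paper's algorithm: the abstract does not specify which knapsack variation is used, so the sketch assumes a multiple-choice form in which each kernel has a few candidate launch configurations and exactly one must be picked. The function name `best_launch_plan`, the tuple layout, the resource names, and all numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: (threads, registers, shared-memory units) per candidate.
Config = Tuple[int, int, int]


def best_launch_plan(
    kernels: List[List[Config]], reg_cap: int, smem_cap: int
) -> Optional[Tuple[int, List[int]]]:
    """Pick one config per kernel to maximize total threads within both caps.

    Returns (total_threads, chosen config index per kernel), or None when
    no combination fits. The state space is bounded by reg_cap * smem_cap,
    which is what makes this kind of DP pseudopolynomial.
    """
    # Maps (registers used, shared memory used) -> (best threads, picks so far).
    states: Dict[Tuple[int, int], Tuple[int, List[int]]] = {(0, 0): (0, [])}
    for configs in kernels:
        nxt: Dict[Tuple[int, int], Tuple[int, List[int]]] = {}
        for (regs, smem), (threads, picks) in states.items():
            for idx, (t, dr, ds) in enumerate(configs):
                nr, ns = regs + dr, smem + ds
                if nr > reg_cap or ns > smem_cap:
                    continue  # this choice would exceed a resource cap
                cand = threads + t
                if (nr, ns) not in nxt or cand > nxt[(nr, ns)][0]:
                    nxt[(nr, ns)] = (cand, picks + [idx])
        states = nxt
        if not states:
            return None  # no feasible choice for this kernel
    return max(states.values(), key=lambda v: v[0])


# Illustrative numbers only; they do not come from the paper.
example_kernels = [
    [(128, 4, 2), (256, 8, 4)],   # kernel A: two candidate configurations
    [(64, 2, 1), (128, 4, 6)],    # kernel B
    [(256, 6, 3), (512, 12, 5)],  # kernel C
]
print(best_launch_plan(example_kernels, reg_cap=16, smem_cap=10))
# prints (576, [1, 0, 0])
```

In this toy instance the largest configuration of every kernel does not fit together, so the plan gives kernel A its larger configuration and keeps B and C small, which is the kind of trade-off a concurrency-maximizing allocation has to make.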

Highlights

With the recent development of semiconductor technology, the number of processing units integrated on a chip has increased rapidly, resulting in massively parallel computing capability.
In our previous work [6], we proposed a GPGPU-based Scientific Data Management System (G-SDMS) that uses Compute Unified Device Architecture (CUDA)-supported GPUs as the platform for query processing in a push-based manner.
We present a general scheme for optimizing the concurrency and overall performance of heterogeneous tasks under the Compute Unified Device Architecture (CUDA) environment [19].

Summary

Objectives

We aim to transform the model into a form that is easier to handle by considering the actual environment in which our problem is defined. The objective of this study is to allocate resources to concurrent CUDA kernels by configuring their runtime parameters so as to maximize system performance.
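For orientation, a generic multiple-choice, multi-dimensional knapsack can be written as below. This is a textbook-style statement under assumed notation, not the paper's exact model: $x_{ij}$ (kernel $i$ uses launch configuration $j$), $t_{ij}$ (threads gained), $w_{ijk}$ (amount of resource $k$ consumed), $C_k$ (capacity of resource $k$), and $J_i$ (candidate configurations of kernel $i$) are all hypothetical symbols introduced here.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{i=1}^{n} \sum_{j \in J_i} t_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{i=1}^{n} \sum_{j \in J_i} w_{ijk}\, x_{ij} \le C_k, \quad k = 1, \dots, m, \\
& \sum_{j \in J_i} x_{ij} = 1, \quad i = 1, \dots, n, \\
& x_{ij} \in \{0, 1\}.
\end{aligned}
```

With integer capacities, a formulation of this shape can be solved by dynamic programming over the used-resource vector, which is where pseudopolynomial time and space bounds typically come from.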
