Abstract

The paper presents assessment of Unified Memory performance with data prefetching and memory oversubscription. Several versions of code are used with: standard memory management, standard Unified Memory and optimized Unified Memory with programmer-assisted data prefetching. Evaluation of execution times is provided for four applications: Sobel and image rotation filters, stream image processing and computational fluid dynamic simulation, performed on Pascal and Volta architecture GPUs—NVIDIA GTX 1080 and NVIDIA V100 cards. Furthermore, we evaluate the possibility of allocating more memory than available on GPUs and assess performance of codes using the three aforementioned implementations, including memory oversubscription available in CUDA. Results serve as recommendations and hints for other similar codes regarding expected performance on modern and already widely available GPUs.

Highlights

  • General-purpose computing on graphic processing unit (GPGPU) has become very popular

  • We compared performance of standard Unified Memory implementations as well as Unified Memory with data prefetching to standard implementation with implicit memory copying

  • We investigated performance of applications using Unified Memory for multistream codes as well as using Unified Memory with memory oversubscription

Read more

Summary

Introduction

General-purpose computing on graphic processing unit (GPGPU) has become very popular Significant advancements, both in hardware and in the CUDA API, have been adopted in recent years. The underlying runtime system migrates memory pages between host’s memory and global memory of a GPU according to when the code running on either the CPU or the GPU refers to it. It allows to simplify the programming model considerably and allows reasonably good performance compared to low-level memory management when Unified Memory is not used [28]. Performance of this feature is investigated in this paper as well, compared to the standard CUDA implementation without Unified Memory

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.