Abstract
OpenACC has been touted as an API designed to make GPGPU programming accessible to scientific programmers, but to date, no studies have attempted to verify this claim quantitatively. In this paper, we conduct an empirical comparison of OpenACC and CUDA along three dimensions: programming time, execution time, and the independence of OpenACC development from prior CUDA experience on high-performance problems. Our results show that, for our programs and our subject pool, the claim holds. We created two classroom assignments, Machine Problem 3 (MP3) and Machine Problem 4 (MP4), and instrumented WebCode, a website we developed, to record details of the students' coding process. The statistical data supported three hypotheses: for the same parallelizable problem, (1) OpenACC programming time is at least 37% shorter than CUDA's; (2) CUDA code runs 9x faster than OpenACC code; (3) OpenACC development work is not significantly affected by previous CUDA experience.
Highlights
High-performance computing has been applied to more and more scientific research applications, such as chemical reaction processes, explosion processes, and fluid dynamics processes.
In comparing OpenACC and CUDA, we mainly investigate whether the OpenACC model is more efficient than CUDA.
(3) We investigate whether OpenACC programming work is affected by previous CUDA experience: a programmer is required to accelerate the same section of serial code first with CUDA and then again with OpenACC.
Summary
High-performance computing has been applied to more and more scientific research applications, such as chemical reaction processes, explosion processes, and fluid dynamics processes. The CUDA programming framework requires programmers to have solid knowledge of both software and hardware, and rewriting legacy code in CUDA is an error-prone and time-consuming task. CUDA, a fully controllable parallel language, is the most representative high-performance parallel language because it lets users schedule resources through optimization techniques such as kernel parameter tuning, use of local memory, data layout in memory, and avoidance of CPU-GPU data transfers. This "controllability" advantage, however, demands solid hardware knowledge and debugging skills. By having programmers use both models on the same code, we can investigate whether OpenACC programming work is affected by previous CUDA experience.
International Journal of Computer Science, Engineering and Applications