Abstract

Modern GPUs can achieve high computing power at low cost, but still requires much time and effort. Tridiagonal system and scan solvers are one example of widely used algorithms which can take advantage of these devices. In this article, one tridiagonal system solver and two scan primitive operators are implemented on CUDA GPUs. To do so, a tuning strategy based on three phases is developed. Additionally, a performance analysis is performed for two different CUDA GPU architectures, resulting in a huge improvement with respect to the state of the art.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call