Abstract

A parallel algorithm called P-scheme/G is proposed for solving recurrence equations on GPGPU systems. It is based on the P-scheme algorithm, which was originally developed for distributed-memory multicomputers. To achieve high performance on GPGPU systems, our method alleviates branch divergence by reducing strided data accesses. We also illustrate the effectiveness of the optimal thread configuration for the recurrence equation. Our experiments on a GTX 590 show that rearranging data using shared memory improves performance by 200% to 300%, and they confirm the validity of the thread configuration policy for both the constant and the non-constant parameter cases. We achieve a speedup of around 400 as a recurrence equation solver with non-constant parameters.
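
To make the idea of solving a recurrence in parallel concrete, the following is a minimal sketch of a generic blocked scheme, not the P-scheme/G algorithm itself, whose details the abstract does not give. It assumes a first-order linear recurrence x[i] = a[i] * x[i-1] + b[i] with non-constant coefficients; the function names `solve_sequential` and `solve_blocked` and the parameter `num_blocks` are hypothetical and chosen for illustration only. Each block first expresses its values as an affine function of its unknown incoming value, a short sequential pass then resolves the block boundaries, and each block finally evaluates its own values independently.

```python
def solve_sequential(a, b, x0):
    """Reference solver: x[i] = a[i] * x[i-1] + b[i], starting from x0."""
    xs = []
    x = x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        xs.append(x)
    return xs


def solve_blocked(a, b, x0, num_blocks):
    """Blocked solver (illustrative assumption, not the paper's method).

    The loops over blocks in phases 1 and 3 have no dependencies between
    blocks, so on parallel hardware each block could run concurrently.
    Here they run one after another for clarity.
    """
    n = len(a)
    if n == 0:
        return []
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    size = -(-n // num_blocks)  # ceiling division
    bounds = [(lo, min(lo + size, n)) for lo in range(0, n, size)]

    # Phase 1: within each block, write x[i] = p[i] * x_in + q[i],
    # where x_in is the (still unknown) value entering the block.
    p = [0.0] * n
    q = [0.0] * n
    for lo, hi in bounds:
        pi, qi = 1.0, 0.0
        for i in range(lo, hi):
            pi, qi = a[i] * pi, a[i] * qi + b[i]
            p[i], q[i] = pi, qi

    # Phase 2: short sequential pass over block boundaries only.
    x_in = [x0]
    for lo, hi in bounds[:-1]:
        x_in.append(p[hi - 1] * x_in[-1] + q[hi - 1])

    # Phase 3: every block evaluates its values from its incoming value.
    x = [0.0] * n
    for k, (lo, hi) in enumerate(bounds):
        for i in range(lo, hi):
            x[i] = p[i] * x_in[k] + q[i]
    return x
```

The sequential part of this sketch touches one value per block rather than one per element, which is what leaves room for parallel speedup. How P-scheme/G maps this kind of work onto GPU threads and shared memory is not described in the abstract and is not modeled here.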
