Abstract

The parallel preconditioned conjugate gradient method (CGM) is often used in adaptive finite element methods (FEMs) and has a critical impact on their performance. This article proposes a method for dynamically balancing the computational load of the CGM between CPU and GPU. To determine the optimal balance, an execution time model for the CGM is developed that accounts for the different execution speeds of the two kinds of processing units. The model relies on data-specific and machine-specific parameters, both of which are determined at runtime. The accuracy of the model is verified in experiments. This auto-tuning-based approach to CPU/GPU collaboration yields significant performance benefits compared to CPU-only or GPU-only execution.
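To make the idea of balancing work by execution speed concrete, the following is a minimal sketch and not the article's model. It assumes a simple linear cost: each unit processes matrix rows at a constant rate measured at runtime, and the GPU pays a fixed per-iteration overhead (for example, for data transfer). The names `balanced_split`, `predicted_time`, `cpu_rate`, `gpu_rate`, and `gpu_overhead` are hypothetical and chosen only for illustration.

```python
def balanced_split(n_rows, cpu_rate, gpu_rate, gpu_overhead=0.0):
    """Split n_rows so that CPU and GPU finish at about the same time.

    Assumed linear model (illustrative only, not taken from the article):
        t_cpu(k) = k / cpu_rate
        t_gpu(k) = gpu_overhead + k / gpu_rate
    Rates are in rows per second, gpu_overhead in seconds.
    """
    if cpu_rate <= 0 or gpu_rate <= 0:
        raise ValueError("rates must be positive")
    # Solve k / cpu_rate = gpu_overhead + (n_rows - k) / gpu_rate for k.
    k = (gpu_overhead + n_rows / gpu_rate) / (1.0 / cpu_rate + 1.0 / gpu_rate)
    # Clamp to a valid row count; a large overhead pushes all rows to the CPU.
    rows_cpu = min(n_rows, max(0, round(k)))
    return rows_cpu, n_rows - rows_cpu


def predicted_time(rows_cpu, rows_gpu, cpu_rate, gpu_rate, gpu_overhead=0.0):
    """Predicted time of one step: the slower of the two units determines it."""
    t_cpu = rows_cpu / cpu_rate
    # The GPU overhead is only paid if the GPU receives any work at all.
    t_gpu = gpu_overhead + rows_gpu / gpu_rate if rows_gpu > 0 else 0.0
    return max(t_cpu, t_gpu)


if __name__ == "__main__":
    # Hypothetical rates; in an auto-tuning setting they would be measured at runtime.
    cpu_rows, gpu_rows = balanced_split(1000, cpu_rate=100.0, gpu_rate=300.0)
    print(cpu_rows, gpu_rows)
    print(predicted_time(cpu_rows, gpu_rows, 100.0, 300.0))
```

Under this assumed model, a GPU that is three times faster than the CPU receives three quarters of the rows, and the predicted time of the balanced split is lower than that of CPU-only or GPU-only execution. A real execution time model for the CGM would likely need additional terms, such as communication between the two units in each iteration.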
