Abstract

Problems that involve large, sparse linear systems are ubiquitous in scientific computing, and there is a strong need to accelerate their solution. Hybrid CPU–GPU systems have recently emerged as a platform with powerful computing capabilities. However, it is not clear how such systems can accelerate these solvers. We study how to make the best use of the CPU and the GPU to minimize the total time required to solve symmetric positive definite systems using the multifrontal method. We analyze the computation and communication costs of the multifrontal method on such hybrid systems to build timing performance models. We propose workload distribution algorithms that determine whether a frontal matrix should be factored on the CPU or on the GPU so as to minimize the total execution time of the overall computation. We provide theoretical analyses and numerical results to illustrate the characteristics and efficiency of the proposed algorithms. Because the performance models and workload distribution algorithms can accommodate different CPUs and GPUs adaptively, we expect the applicability and significance of these techniques to continue to grow as heterogeneous hardware and software evolve.
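To make the idea of a timing model driving the CPU-or-GPU decision concrete, the following is a minimal sketch and not the algorithm proposed in this work. It assumes a frontal matrix of order n with k pivots, leading-order flop counts for a dense partial Cholesky factorization, and a GPU cost that adds a host-to-device and device-to-host transfer of the full front. All function names, rates, bandwidths, and latencies are hypothetical placeholders, and the per-front greedy choice ignores where the update matrices of child fronts reside.

```python
def partial_cholesky_flops(n: int, k: int) -> float:
    """Leading-order flops to eliminate k pivots from an n-by-n dense front.

    Factor the k-by-k pivot block (k^3/3), solve for the (n-k)-by-k panel
    (k^2 * m), and form the Schur complement update (k * m^2), with m = n - k.
    """
    m = n - k
    return k**3 / 3 + k**2 * m + k * m**2


def cpu_time(n: int, k: int, cpu_rate: float) -> float:
    """Assumed CPU model: pure compute at a sustained flop rate."""
    return partial_cholesky_flops(n, k) / cpu_rate


def gpu_time(n: int, k: int, gpu_rate: float,
             bandwidth: float, latency: float) -> float:
    """Assumed GPU model: compute plus a round trip of the front over the bus."""
    bytes_one_way = 8 * n * n  # double-precision entries of the full front
    transfer = 2 * (latency + bytes_one_way / bandwidth)
    return transfer + partial_cholesky_flops(n, k) / gpu_rate


def assign_front(n: int, k: int,
                 cpu_rate: float = 1e10,    # hypothetical: 10 Gflop/s
                 gpu_rate: float = 1e11,    # hypothetical: 100 Gflop/s
                 bandwidth: float = 5e9,    # hypothetical: 5 GB/s
                 latency: float = 1e-5) -> str:  # hypothetical: 10 microseconds
    """Pick the device with the smaller predicted time for this front."""
    t_cpu = cpu_time(n, k, cpu_rate)
    t_gpu = gpu_time(n, k, gpu_rate, bandwidth, latency)
    return "gpu" if t_gpu < t_cpu else "cpu"


if __name__ == "__main__":
    for n, k in [(10, 5), (500, 250), (5000, 2500)]:
        print(n, k, assign_front(n, k))
```

Under these placeholder parameters, small fronts stay on the CPU because the transfer latency dominates, while large fronts move to the GPU once the compute savings outweigh the transfer cost. Changing the rates shifts the crossover point, which is the sense in which such a model can adapt to different hardware.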
