Task-based Cholesky decomposition on Xeon Phi architectures using OpenMP

Jack Dongarra, Piotr Luszczek, Joseph Dorris, Asim Yarkhan, Jakub Kurzak

doi:10.1504/ijcse.2017.10011398

Abstract

The increasing number of computational cores in modern many-core processors, as represented by the Intel Xeon Phi architectures, has created the need for an open-source, high-performance and scalable task-based dense linear algebra package that can use this type of many-core hardware efficiently. In this paper, we examine the design modifications necessary to make PLASMA, a task-based dense linear algebra library, run effectively on two generations of Intel's Xeon Phi architecture, known as Knights Corner (KNC) and Knights Landing (KNL). First, we modified PLASMA's tiled Cholesky decomposition to use OpenMP tasks as its scheduling mechanism, which enables Xeon Phi compatibility. We then compared the performance of the modified code with that of the original dynamic scheduler running on an Intel Xeon Sandy Bridge CPU. Finally, we looked at the performance of the OpenMP tiled Cholesky decomposition on Knights Corner and Knights Landing processors. We detail the optimisations required to obtain good performance on these platforms and compare the results with the highly tuned Intel MKL math library.

More From: International Journal of Computational Science and Engineering

Similar Papers

Analytical Performance Modeling and Validation of Intel's Xeon Phi Architecture
Sudheer Chunduri ... Venkatram Vishwanath
15 May 2017

The Power-Performance Tradeoffs of the Intel Xeon Phi on HPC Applications
Bo Li ... Hung-Ching Chang
01 May 2014

Challenges on Porting Lattice Boltzmann Method on Accelerators
Claudio Schepke ... João V F Lima
01 Jan 2018

Embree ray tracing kernels for CPUs and the Xeon Phi architecture
Sven Woop ... Louis Feng
21 Jul 2013
