Explicit Fourth-Order Runge–Kutta Method on Intel Xeon Phi Coprocessor

Beata Bylina, Joanna Potiopa

doi:10.1007/s10766-016-0458-x

Open Access

https://doi.org/10.1007/s10766-016-0458-x

Abstract

This paper concerns an Intel Xeon Phi implementation of the explicit fourth-order Runge–Kutta method (RK4) for very sparse matrices with very short rows. Such matrices arise in Markovian modeling of computer and telecommunication networks. Two implementations were investigated, one based on Intel Math Kernel Library (Intel MKL) routines and one written by the authors; both use the CSR storage scheme and run on Intel Xeon Phi. The Intel MKL version uses the high-performance BLAS and Sparse BLAS routines. Our application focuses on OpenMP-style programming. We implement the SpMV operation and vector addition using basic optimization techniques and vectorization. We evaluate our approach in native and offload modes for various numbers of cores and thread affinity settings. The two implementations were compared with respect to time, speedup and performance. The numerical experiments on Intel Xeon Phi show that the performance of the authors' implementation is very promising: it gives a gain of up to two times over the multithreaded Intel MKL implementation running on a CPU (Intel Xeon processor), and up to three times over the application that uses Intel MKL on Intel Xeon Phi.
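The building blocks named in the abstract (SpMV on a CSR matrix, vector addition, and the RK4 update) can be sketched in plain C as follows. This is a minimal illustration, not the authors' code: it assumes a linear system y' = A·y, as arises when A is the transposed generator of a Markov chain, and every name in it (`spmv_csr`, `add_scaled`, `rk4_step`, `markov_demo`) is hypothetical. The OpenMP pragma takes effect only when the file is compiled with OpenMP enabled.

```c
#include <assert.h>
#include <stddef.h>

/* y = A*x for an n-row matrix in CSR form (row_ptr has n+1 entries). */
void spmv_csr(int n, const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        /* Rows are very short in the matrices the abstract describes. */
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

/* out = y + a*k: the vector addition used between RK4 stages. */
void add_scaled(int n, const double *y, double a, const double *k,
                double *out)
{
    for (int i = 0; i < n; ++i)
        out[i] = y[i] + a * k[i];
}

/* One RK4 step of size h for y' = A*y (assumed form).
   work must hold 5*n doubles: k1, k2, k3, k4 and a scratch vector. */
void rk4_step(int n, const int *row_ptr, const int *col_idx,
              const double *val, double h, double *y, double *work)
{
    assert(n > 0 && work != NULL);
    double *k1 = work, *k2 = work + n, *k3 = work + 2 * n;
    double *k4 = work + 3 * n, *tmp = work + 4 * n;

    spmv_csr(n, row_ptr, col_idx, val, y, k1);      /* k1 = A*y            */
    add_scaled(n, y, 0.5 * h, k1, tmp);
    spmv_csr(n, row_ptr, col_idx, val, tmp, k2);    /* k2 = A*(y + h/2 k1) */
    add_scaled(n, y, 0.5 * h, k2, tmp);
    spmv_csr(n, row_ptr, col_idx, val, tmp, k3);    /* k3 = A*(y + h/2 k2) */
    add_scaled(n, y, h, k3, tmp);
    spmv_csr(n, row_ptr, col_idx, val, tmp, k4);    /* k4 = A*(y + h k3)   */

    for (int i = 0; i < n; ++i)
        y[i] += h / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
}

/* Illustrative two-state chain with generator Q = [[-1, 1], [2, -2]].
   A = Q^T is stored in CSR; starts from pi = (1, 0), integrates to t = 10. */
void markov_demo(double pi[2])
{
    const int row_ptr[] = {0, 2, 4};
    const int col_idx[] = {0, 1, 0, 1};
    const double val[]  = {-1.0, 2.0, 1.0, -2.0};
    double work[10];

    pi[0] = 1.0;
    pi[1] = 0.0;
    for (int step = 0; step < 1000; ++step)
        rk4_step(2, row_ptr, col_idx, val, 0.01, pi, work);
}
```

Calling `markov_demo` from a `main` function fills `pi` with values close to the chain's stationary distribution (2/3, 1/3). Each RK4 step costs four SpMV calls plus a handful of vector updates, which is why SpMV and vector addition are the natural kernels to optimize.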

Journal: International Journal of Parallel Programming	Publication Date: Sep 29, 2016
Citations: 4	License type: open-access

Similar Papers

The Exploration of Pervasive and Fine-Grained Parallel Model Applied on Intel Xeon Phi Coprocessor
Christophe Calvin ... Fan Ye
01 Oct 2013

On the Mitigation of Cache Hostile Memory Access Patterns on Many-Core CPU Architectures
Tom Deakin ... Simon Mcintosh-Smith
01 Jan 2017

Toward a BLAS library truly portable across different accelerator types
Eduardo Rodriguez-Gutiez ... Arturo Gonzalez-Escribano
The Journal of Supercomputing | VOL. 75
10 Jun 2019

Parallel independent FFT implementation on intel processors and Xeon phi for LTE and OFDM systems
Mounir Khelifi ... Yvon Savaria
01 Oct 2015
