Performance Evaluation of Multi-Core Intel Xeon Processors on Basic Linear Algebra Subprograms

Mostafa I Soliman

doi:10.1142/s0129626409000134

Abstract

Multi-core technology is a natural next step in delivering the benefits of Moore's law to computing platforms. On multi-core processors, many applications can run faster when their code is split into threads that execute in parallel. This paper evaluates the performance of multi-core Intel Xeon processors on the widely used Basic Linear Algebra Subprograms (BLAS). On two dual-core Intel Xeon processors with Hyper-Threading technology, our results show that around 20 GFLOPS is achieved on Level-3 BLAS (matrix-matrix operations) using multi-threading, SIMD, matrix blocking, and loop unrolling. For small problem sizes in Level-2 (matrix-vector operations) and Level-1 (vector operations) BLAS, however, multi-threading slows execution down because of thread creation overhead. In that case, the Intel SIMD instruction set is the way to improve performance, with single-threaded code reaching 6 GFLOPS on Level-2 and 3 GFLOPS on Level-1 BLAS. When the problem size becomes large (the data cannot fit in the L2 cache), the performance of the four Xeon cores is less than 2 GFLOPS on Level-2 and less than 1 GFLOPS on Level-1 BLAS, even though eight threads are executed in parallel on eight logical processors.
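Two of the techniques named in the abstract, matrix blocking and loop unrolling, can be illustrated on a Level-3 style matrix-matrix product. The C code below is a minimal sketch and not the paper's implementation: the function names `gemm_naive` and `gemm_blocked`, the tile-size parameter `bs`, and the unroll factor of 4 are all assumptions chosen for illustration, and the sketch leaves out multi-threading and explicit SIMD instructions.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Reference triple loop: c += a * b for n x n row-major matrices. */
void gemm_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] += sum;
        }
}

/*
 * Blocked variant: c += a * b, processed in bs x bs tiles so that the
 * tiles being combined can stay in cache while they are reused.
 * bs is a hypothetical tuning parameter, not a value from the paper.
 */
void gemm_blocked(size_t n, size_t bs,
                  const double *a, const double *b, double *c)
{
    assert(a && b && c && bs > 0);
    for (size_t ii = 0; ii < n; ii += bs) {
        size_t i_end = min_sz(ii + bs, n);
        for (size_t kk = 0; kk < n; kk += bs) {
            size_t k_end = min_sz(kk + bs, n);
            for (size_t jj = 0; jj < n; jj += bs) {
                size_t j_end = min_sz(jj + bs, n);
                for (size_t i = ii; i < i_end; i++) {
                    double *crow = &c[i * n];
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        const double *brow = &b[k * n];
                        size_t j = jj;
                        /* Inner loop unrolled by 4 (assumed factor). */
                        for (; j + 4 <= j_end; j += 4) {
                            crow[j]     += aik * brow[j];
                            crow[j + 1] += aik * brow[j + 1];
                            crow[j + 2] += aik * brow[j + 2];
                            crow[j + 3] += aik * brow[j + 3];
                        }
                        /* Remainder when the tile width is not a multiple of 4. */
                        for (; j < j_end; j++)
                            crow[j] += aik * brow[j];
                    }
                }
            }
        }
    }
}
```

The unit-stride inner loop over `j` is the part a compiler or hand-written SIMD code would vectorize, and the outer `ii` loop is a natural place to divide work among threads, since each thread would then write to a disjoint set of rows of `c`.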


Journal: Parallel Processing Letters
Publication Date: Mar 1, 2009
Citations: 4

