Fine-grained bulge-chasing kernels for strongly scalable parallel QR algorithms

L Karlsson, B Kågström, E Wadbro

doi:10.1016/j.parco.2014.04.003

Abstract

The bulge-chasing kernel in the small-bulge multi-shift QR algorithm for the non-symmetric dense eigenvalue problem becomes a sequential bottleneck when the QR algorithm is run in parallel on a multicore platform with shared memory. The duration of each kernel invocation is short, but the critical path of the QR algorithm contains a long sequence of calls to the bulge-chasing kernel. We study the problem of parallelizing the bulge-chasing kernel itself across a handful of processor cores in order to reduce the execution time of the critical path. We propose and evaluate a sequence of four algorithms with varying degrees of complexity and verify that a pipelined algorithm with a slowly shifting block column distribution of the Hessenberg matrix is superior. The load-balancing problem is non-trivial and computational experiments show that the load-balancing scheme has a large impact on the overall performance. We propose two heuristics for the load-balancing problem and also an effective optimization method based on local search. Numerical experiments show that speed-ups are obtained for problems as small as 40x40 on two different multicore architectures.
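The abstract mentions heuristics and a local-search optimization for load balancing but gives no details. To make the idea concrete, the following is a minimal sketch of a generic local search over a contiguous block column distribution. It is not the authors' method. The function names, the per-column cost model (column j of an upper Hessenberg matrix costing j + 2 units, matching its number of nonzero entries) and the neighborhood (moving one block boundary by one column) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def block_loads(costs: Sequence[float], bounds: Sequence[int]) -> List[float]:
    """Total cost of each contiguous block; bounds holds p + 1 column indices."""
    return [sum(costs[bounds[i]:bounds[i + 1]]) for i in range(len(bounds) - 1)]


def even_split(n: int, p: int) -> List[int]:
    """Starting heuristic: p blocks with nearly equal column counts (needs n >= p)."""
    return [i * n // p for i in range(p + 1)]


def local_search(costs: Sequence[float], p: int) -> Tuple[List[int], float]:
    """Shift interior block boundaries by one column while the maximum load drops."""
    bounds = even_split(len(costs), p)
    best = max(block_loads(costs, bounds))
    improved = True
    while improved:
        improved = False
        for i in range(1, p):  # interior boundaries only
            for step in (-1, 1):
                cand = list(bounds)
                cand[i] += step
                # Reject moves that would leave a block empty.
                if not (cand[i - 1] < cand[i] < cand[i + 1]):
                    continue
                load = max(block_loads(costs, cand))
                if load < best:
                    bounds, best = cand, load
                    improved = True
    return bounds, best


# Hypothetical cost model: column j of an 8 x 8 Hessenberg matrix costs j + 2.
costs = [j + 2 for j in range(8)]
bounds, best = local_search(costs, p=2)
print(bounds, best)
```

With this toy cost model and two cores, the even split places columns 0 to 3 on one core and 4 to 7 on the other, and the search moves the boundary one column to the right to reduce the maximum load. A real scheme would need a cost model measured on the target machine and would have to account for the pipelining and the shifting distribution described in the abstract.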


Journal: Parallel Computing
Publication Date: Apr 24, 2014
Citations: 10

Similar Papers

I/O efficient QR and QZ algorithms
Sraban Kumar Mohanty ... Sajith Gopalan
01 Dec 2012

The transmission of shifts and shift blurring in the QR algorithm
David S Watkins
Linear Algebra and its Applications | VOL. 241
01 Jul 1996

On Aggressive Early Deflation in Parallel Variants of the QR Algorithm
Bo Kågström ... Meiyue Shao
01 Jan 2012

An Improved Data Packet Capture Method Based on Multicore Platform
Xian Zhang ... Jia Liu
01 Jan 2017
