Abstract

We have previously presented various plane rotation patterns that provide stable O(N^2) algorithms for reducing a b-band matrix of order N, bordered by p rows and/or columns, to (b + p)-band form, where b ⩾ 1 and p ⩾ 1. By splitting the matrix into two similarly structured submatrices and chasing nonzeros to the corners in two directions, the newly proposed patterns reduce the computational cost by 50% compared to existing one-way chasing algorithms. In this paper, we show how these rotation patterns can be efficiently parallelized when reducing a one-bordered bidiagonal matrix to tridiagonal form, possibly followed by bidiagonalization. Applications are found in updating total least squares solutions and signal or noise subspaces by means of a partial singular value decomposition. For each scheme, a linear systolic network and a parallel VLSI computing structure are presented. These architectures reduce the overall computing time for the tridiagonalization from O(N^2) to O(N) using O(N) processors. In particular, it is shown that the best two-way chasing parallel implementation reduces the computation time of the tridiagonalization by 50% compared to the one-way chasing parallel implementation, using the same number of processors. If the original bandwidth is additionally restored, then all proposed two-way chasing parallel implementations achieve an 8% reduction in overall computing time compared to the one-way chasing scheme.
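
The basic building block named above, a plane (Givens) rotation, can be illustrated with a minimal sketch. This is not the paper's one-way or two-way chasing pattern, only a single rotation applied to a small hypothetical matrix: a 3x3 upper bidiagonal block with one border row appended. The helper names `givens`, `rotate_rows`, and `frobenius_sq` and the matrix entries are assumptions chosen for illustration. The sketch shows that one rotation annihilates one border entry, preserves the Frobenius norm, and creates a new nonzero outside the band, which is the kind of fill-in that a chasing scheme then has to move toward a corner.

```python
import math

def givens(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0  # nothing to annihilate; use the identity rotation
    return a / r, b / r

def rotate_rows(M, i, k, c, s):
    """Apply the plane rotation (c, s) to rows i and k of M, in place."""
    for j in range(len(M[i])):
        x, y = M[i][j], M[k][j]
        M[i][j] = c * x + s * y
        M[k][j] = -s * x + c * y

def frobenius_sq(M):
    """Squared Frobenius norm; orthogonal rotations leave it unchanged."""
    return sum(v * v for row in M for v in row)

# Hypothetical example data: upper bidiagonal block plus one border row.
M = [
    [4.0, 1.0, 0.0],
    [0.0, 3.0, 2.0],
    [0.0, 0.0, 5.0],
    [1.0, 1.0, 1.0],  # border row
]

norm_before = frobenius_sq(M)

# Annihilate the first entry of the border row against the first diagonal entry.
c, s = givens(M[0][0], M[3][0])
rotate_rows(M, 0, 3, c, s)

print("border entry after rotation:", M[3][0])
print("fill-in created at (0, 2):", M[0][2])
print("squared norm before/after:", norm_before, frobenius_sq(M))
```

Each rotation touches only two rows, so rotations acting on disjoint row pairs are independent of one another; that independence is what makes a processor-array mapping plausible in general, although the specific schedules are those developed in the paper itself.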
