We present a new parallel algorithm for computing a least-squares solution to a sparse overdetermined system of linear equations $Ax = b$, where the $m \times n$ matrix $A$ is sparse and the graph $G = (V, E)$ of the matrix $H = \begin{pmatrix} I & A^T \\ A & O \end{pmatrix}$ has an $s(m+n)$-separator family, i.e. either $|V| < n_0$ for a fixed constant $n_0$ or, by deleting a separator subset $S$ of vertices of size $\le s(m+n)$, $G$ can be partitioned into two disconnected subgraphs having vertex sets $V_1$, $V_2$ of size $\le \tfrac{2}{3}(m+n)$, and each of the two resulting subgraphs induced by the vertex sets $S \cup V_i$, $i = 1, 2$, can be recursively $s(|S \cup V_i|)$-separated in a similar way. Our algorithm uses $O(\log(m+n) \log^2 s(m+n))$ steps and $\le s^3(m+n)$ processors; it relies on our recent parallel algorithm for solving sparse linear systems and has several immediate applications of interest, in particular to mathematical programming, to sparse nonsymmetric systems of linear equations and to path algebra computations. We most closely examine the impact on the linear programming problem (LPP), which requires maximizing $c^T y$ subject to $A^T y \le b$, $y \ge 0$, where $A$ is an $m \times n$ matrix. Hereafter it is assumed that $m \ge n$. The recent algorithm by Karmarkar gives the best-known upper estimate [$O(m^{3.5} L)$ arithmetic operations, where $L$ is the input size] for the worst-case cost of solving this problem. We prove an asymptotic improvement of that result in the case where the graph of the associated matrix $H$ has an $s(m+n)$-separator family; then our algorithm can be implemented using $O(mL \log m \log^2 s(m+n))$ parallel arithmetic steps, $s^3(m+n)$ processors and a total of $O(mL s^3(m+n) \log m \log^2 s(m+n))$ arithmetic operations.
In many cases of practical importance this is a considerable improvement on the known estimates: for example, $s(m+n) = \sqrt{8(m+n)}$ if $G$ is planar [as occurs in many operations research applications; for instance, in the problem of computing the maximum multicommodity flow with a bounded number of commodities in a network having an $s(m+n)$-separator family], so that the processor bound is only $8\sqrt{8}\,(m+n)^{1.5}$ and the total number of arithmetic steps is $O(m^{2.5} L)$. Similarly, Karmarkar's algorithm and the known algorithms for the solution of overdetermined linear systems are accelerated in the case of dense input matrices via our recent parallel algorithms for the inversion of dense $k \times k$ matrices using $O(\log^2 k)$ steps and $k^3$ processors. Combined with a modification of Karmarkar's algorithm, this implies a solution of the LPP using $O(Lm \log^2 m)$ steps and $m^{2.5}$ processors. The stated results promise some important practical applications. Theoretically, the above processor bounds can be reduced for dense matrix inversion to $o(k^{2.5})$ and for the LPP to $o(m^{2.165})$ in the dense case and to $o(s^{2.5}(m+n))$ in the sparse case (preserving the same number of parallel steps); this also decreases the known sequential time bound for the LPP by a factor of $m^{0.335}$, i.e. to $O(Lm^{3.165})$.
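
As a rough numerical check of the planar-case arithmetic quoted above, the following minimal sketch evaluates the separator size $s(N) = \sqrt{8N}$ with $N = m + n$, the resulting processor bound $s^3(N)$, and the sequential exponent $3.5 - 0.335$. The function names and the sample sizes are hypothetical, chosen only for illustration; the constants hidden in the $O(\cdot)$ estimates are ignored.

```python
import math


def planar_separator_size(num_vertices: int) -> float:
    """Separator bound s(N) = sqrt(8 N) quoted for planar graphs."""
    return math.sqrt(8 * num_vertices)


def processor_bound(num_vertices: int) -> float:
    """Processor bound s(N)^3 stated for the sparse case."""
    return planar_separator_size(num_vertices) ** 3


def closed_form_processor_bound(num_vertices: int) -> float:
    """Closed form 8 * sqrt(8) * N^1.5 quoted for planar G."""
    return 8 * math.sqrt(8) * num_vertices ** 1.5


def reduced_sequential_exponent(base: float = 3.5, saving: float = 0.335) -> float:
    """Exponent of m in the sequential bound after the m^0.335 saving."""
    return base - saving


if __name__ == "__main__":
    m, n = 3000, 1000          # hypothetical problem sizes with m >= n
    total = m + n              # number of vertices of the graph of H
    print("separator size s(m+n):", planar_separator_size(total))
    print("processors s^3(m+n):  ", processor_bound(total))
    print("closed form:          ", closed_form_processor_bound(total))
    print("sequential exponent:  ", reduced_sequential_exponent())
```

The two processor figures agree because $(\sqrt{8N})^3 = 8\sqrt{8}\,N^{1.5}$, which is the simplification used in the text.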