A new parallel extended GCD algorithm is proposed. It matches the best existing parallel integer GCD algorithms of Sorenson and of Chor and Goldreich: it runs in $O_\epsilon(n / \log n)$ time using at most $n^{1+\epsilon}$ processors on a CRCW PRAM. Sorenson and Chor and Goldreich both use a modular approach that considers the least significant bits. By contrast, our algorithm deals only with the leading bits of the integers $u$ and $v$, with $u \geq v$. This approach is better suited to extended GCD algorithms, since the coefficients $a$ and $b$ of the extended version, which satisfy $a u + b v = \gcd(u, v)$, are closely tied to the order of magnitude of the rational $v / u$ and its continuants. Consequently, computing these coefficients is much easier.
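To make the relation $a u + b v = \gcd(u, v)$ concrete, the following is a minimal sketch of the classical sequential extended Euclidean algorithm. It is not the parallel algorithm proposed here. The function name `extended_gcd` and the returned list of quotients are illustrative assumptions. The quotients it collects are the partial quotients of the continued fraction of $u / v$, from which the coefficients $a$ and $b$ are built up step by step.

```python
def extended_gcd(u, v):
    """Return (g, a, b, quotients) with a*u + b*v == g == gcd(u, v).

    Classical sequential extended Euclid, shown for illustration only.
    Assumes integers u >= v >= 0.
    """
    # Loop invariants: r0 == a0*u + b0*v and r1 == a1*u + b1*v.
    r0, r1 = u, v
    a0, a1 = 1, 0
    b0, b1 = 0, 1
    quotients = []  # partial quotients of the continued fraction of u / v
    while r1 != 0:
        q = r0 // r1
        quotients.append(q)
        # Apply the same linear update to remainders and coefficients.
        r0, r1 = r1, r0 - q * r1
        a0, a1 = a1, a0 - q * a1
        b0, b1 = b1, b0 - q * b1
    return r0, a0, b0, quotients


if __name__ == "__main__":
    g, a, b, qs = extended_gcd(240, 46)
    print(g, a, b, qs)
```

For $u = 240$ and $v = 46$ the sketch yields $\gcd = 2$ with $a = -9$ and $b = 47$, since $-9 \cdot 240 + 47 \cdot 46 = 2$.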