Abstract

We propose a set of new Fortran reference implementations, based on an algorithm proposed by Kahan, for the Level 1 BLAS routines *NRM2, which compute the Euclidean norm of a real or complex input vector. The principal advantage of these routines over the current offerings is that, rather than losing accuracy as the length of the vector increases, they generate results that are accurate to almost machine precision for vectors of length N < N_max, where N_max depends on the precision of the floating-point arithmetic being used. In addition, we use the intrinsic modules introduced in the latest Fortran standards to detect non-finite numbers in the input data, return suitable values, and set the IEEE floating-point status flags as appropriate. A set of C interface routines is also provided to allow simple, portable access to the new routines. To improve execution speed, we advocate a hybrid algorithm: a simple loop is used first, and we fall back on Kahan’s algorithm only if an IEEE floating-point exception flag is raised. Since most input vectors are “easy,” i.e., they do not require the sophistication of Kahan’s algorithm, the simple loop improves performance in the common case, while the fallback, with its use of compensated summation, ensures high accuracy. We also report on a comprehensive suite of test problems developed to test both our new implementation and existing codes for accuracy and for the appropriate setting of the IEEE arithmetic status flags.
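
The hybrid strategy described above can be illustrated with a minimal Python sketch for real vectors. It is not the Fortran reference implementation and makes several assumptions. Python does not expose the IEEE status flags portably, so range checks on the accumulated sum stand in for the flag tests. The fallback is a generic scaled sum of squares with compensated (Kahan) summation, not necessarily the exact algorithm used in the paper. The function names, the threshold `tiny`, and the return values chosen for NaN and infinite inputs are all hypothetical.

```python
import math
import sys


def nrm2_fallback(x):
    """Careful path: scale by the largest magnitude, then sum squares
    with compensated (Kahan) summation. Assumed conventions: any NaN
    gives NaN, otherwise any infinity gives +inf."""
    if any(math.isnan(v) for v in x):
        return math.nan
    scale = max((abs(v) for v in x), default=0.0)
    if scale == 0.0 or math.isinf(scale):
        return scale
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in x:
        t = v / scale            # |t| <= 1, so t * t cannot overflow
        term = t * t - comp
        new_total = total + term
        comp = (new_total - total) - term
        total = new_total
    return scale * math.sqrt(total)


def nrm2_hybrid(x):
    """Fast path first; use the careful path only when the fast result
    looks unsafe (a stand-in for testing IEEE exception flags)."""
    s = 0.0
    for v in x:
        s += v * v               # may overflow to inf or underflow to 0
    # Hypothetical threshold: above it, underflowed squares are negligible.
    tiny = sys.float_info.min / sys.float_info.epsilon
    if math.isfinite(s) and s > tiny:
        return math.sqrt(s)
    return nrm2_fallback(x)
```

For an "easy" vector such as `[3.0, 4.0]` the fast loop alone produces the result. A vector such as `[3e200, 4e200]` overflows in the fast loop and is handled by the scaled fallback.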
