Abstract
Many numerical problems require higher computing precision than standard floating-point (FP) formats offer. One common way of extending the precision is to represent numbers in a <i>multiple component</i> format. With so-called <i>floating-point expansions</i>, real numbers are represented as the unevaluated sum of standard machine-precision FP numbers. This representation has the advantage of relying directly on available, hardware-implemented and highly optimized FP operations. It is used by multiple-precision libraries such as Bailey's QD and its Graphics Processing Unit (GPU) tuned analogue, GQD. In this article we briefly revisit algorithms for adding and multiplying FP expansions, then introduce and prove new algorithms for normalizing, dividing and taking square roots of FP expansions. The new method for computing the reciprocal <inline-formula><tex-math notation="LaTeX">${a}^{-1}$</tex-math></inline-formula> and the square root <inline-formula><tex-math notation="LaTeX">$\sqrt{a}$</tex-math></inline-formula> of an FP expansion <inline-formula><tex-math notation="LaTeX">$a$</tex-math></inline-formula> is based on an adapted Newton-Raphson iteration in which the intermediate calculations are done using “truncated” operations (additions, multiplications) involving FP expansions. We give a thorough error analysis showing that the method allows very accurate computations.
More precisely, after <inline-formula><tex-math notation="LaTeX">$q$</tex-math></inline-formula> iterations, the computed FP expansion <inline-formula><tex-math notation="LaTeX">$x=x_0+\ldots +x_{2^q-1}$</tex-math></inline-formula> satisfies, for the reciprocal algorithm, the relative error bound <inline-formula><tex-math notation="LaTeX">$\left|({x-a^{-1}})/{a^{-1}}\right| \le 2^{-2^q(p-3)-1}$</tex-math></inline-formula> and, for the square root algorithm, the bound <inline-formula><tex-math notation="LaTeX">$\left|x-{1}/{\sqrt{a}}\right| \le {2^{-2^q(p-3)-1}}/{\sqrt{a}}$</tex-math></inline-formula>, where <inline-formula><tex-math notation="LaTeX">$p> 2$</tex-math></inline-formula> is the precision of the FP representation used (<inline-formula><tex-math notation="LaTeX">$p=24$</tex-math></inline-formula> for single precision and <inline-formula><tex-math notation="LaTeX">$p=53$</tex-math></inline-formula> for double precision).
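The following Python sketch may help make the representation and the shape of the reciprocal bound concrete. It is not the algorithm of the article: the article uses truncated additions and multiplications of FP expansions, whereas this sketch substitutes exact rational arithmetic (`fractions.Fraction`) and then rounds back to the first 2^(i+1) doubles after iteration i. The helper names `two_sum`, `exact_value`, `to_expansion` and `reciprocal` are hypothetical and chosen only for illustration. `two_sum` is the classical error-free addition that yields a two-term expansion of a sum of two doubles.

```python
from fractions import Fraction


def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def exact_value(expansion):
    """Exact rational value of an expansion, i.e. the unevaluated sum of its doubles."""
    return sum((Fraction(t) for t in expansion), Fraction(0))


def to_expansion(value, terms):
    """Greedily round an exact rational to `terms` doubles whose sum approximates it."""
    out = []
    rest = Fraction(value)
    for _ in range(terms):
        t = float(rest)          # nearest double to the remaining residual
        out.append(t)
        rest -= Fraction(t)      # exact residual left to represent
    return out


def reciprocal(a, q):
    """Newton-Raphson x <- x * (2 - a * x), doubling the number of terms each step.

    Assumption: exact rational arithmetic stands in for the truncated
    expansion operations analysed in the article.
    """
    a_exact = exact_value(a)
    x = [1.0 / a[0]]                         # one-term starting approximation
    for i in range(q):
        xv = exact_value(x)
        xv = xv * (2 - a_exact * xv)         # one Newton step, computed exactly
        x = to_expansion(xv, 2 ** (i + 1))   # keep 2^(i+1) terms
    return x


if __name__ == "__main__":
    p, q = 53, 2
    a = to_expansion(Fraction(10, 7), 4)     # a 4-term expansion close to 10/7
    x = reciprocal(a, q)
    rel_err = abs(exact_value(x) * exact_value(a) - 1)   # |x - 1/a| / (1/a)
    bound = Fraction(1, 2 ** (2 ** q * (p - 3) + 1))
    print(len(x), rel_err <= bound)
```

The relative error is evaluated as |x·a − 1|, which equals |(x − a⁻¹)/a⁻¹| for a > 0, and is compared against the bound stated above with p = 53. Because the intermediate steps here are exact, the sketch says nothing about the cost or the error behaviour of the truncated operations themselves.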