Accurate and Efficient Floating Point Summation

James Demmel, Yozo Hida

doi:10.1137/s1064827502407627

Abstract

We present and analyze several simple algorithms for accurately computing the sum of n floating point numbers using a wider accumulator. Let f and F be the number of significant bits in the summands and the accumulator, respectively. Then, assuming gradual underflow, no overflow, and round-to-nearest arithmetic, up to approximately 2^(F-f) numbers can be added accurately by simply summing the terms in decreasing order of exponents, yielding a sum correct to within about 1.5 units in the last place (ulps). We apply this result to the floating point formats in the IEEE floating point standard. For example, a dot product of single precision vectors of length at most 33 computed using double precision and sorting is guaranteed correct to nearly 1.5 ulps. If double-extended precision is used, the vector length can be as large as 65,537. We also investigate how the cost of sorting can be reduced or eliminated while retaining accuracy.
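The sorted dot product described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `to_f32`, `sorted_dot`, and `MAX_TERMS` are hypothetical, Python's `float` stands in for the double precision accumulator (F = 53), and each product of two single precision values is treated as a summand with f = 48 significant bits, which is an assumption chosen because it reproduces the abstract's length-33 example.

```python
import math
import struct

# Assumption: with F = 53 and f = 48, 2**(F - f) + 1 matches the abstract's
# vector length of 33; the abstract states the general bound only approximately.
MAX_TERMS = 2 ** (53 - 48) + 1


def to_f32(x: float) -> float:
    """Round x to the nearest IEEE single precision value (kept in a Python float)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sorted_dot(xs, ys):
    """Dot product of single precision vectors, accumulated in double precision.

    The products are summed in decreasing order of their binary exponents.
    """
    # A product of two 24-bit significands needs at most 48 bits, so it is
    # exact in double precision as long as it neither overflows nor underflows.
    products = [x * y for x, y in zip(xs, ys)]
    # math.frexp(p) returns (m, e) with p == m * 2**e; sort on e, largest first.
    # Zeros get exponent 0, which is harmless because adding zero is exact.
    products.sort(key=lambda p: math.frexp(p)[1], reverse=True)
    acc = 0.0
    for p in products:
        acc += p
    return acc


# Usage: the large terms cancel first, so the small term is not absorbed.
xs = [to_f32(2.0 ** 30), to_f32(1.0), to_f32(-(2.0 ** 30))]
ys = [to_f32(2.0 ** 30), to_f32(1.0), to_f32(2.0 ** 30)]
result = sorted_dot(xs, ys)
```

In this example, adding the products left to right in double precision would absorb the 1.0 into 2^60 and return 0.0, whereas the exponent-sorted order returns 1.0. The sketch does not check the vector length against `MAX_TERMS` and does not handle overflow or NaN inputs.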

Journal: SIAM Journal on Scientific Computing: a publication of the Society for Industrial and Applied Mathematics
Publication Date: Jan 1, 2004
Citations: 96
