Accurate Parallel Floating-Point Accumulation

E. Kadric, P. Gurniak, A. DeHon

doi:10.1109/arith.2013.19

Abstract

Using parallel associative reduction, iterative refinement, and conservative termination detection, we show how to use tree-reduce parallelism to compute correctly rounded floating-point sums in O(log N) depth at arbitrary throughput. Our parallel solution shows how we can continue to exploit Moore's Law scaling in transistor count to accelerate floating-point performance even when clock rates remain flat. Empirical evidence suggests that our iterative algorithm requires only two tree-reduce passes to converge to the accurate sum in virtually all cases. Furthermore, we develop the hardware implementation of a 250 MHz, pipelined, native, residue-preserving IEEE-754 double-precision floating-point adder on a Virtex-6 FPGA that requires only 48% more area than a standard adder without residue. Finally, we show how this module can be used as the base of a streaming accurate floating-point accumulation unit that can be tuned to consume m summands every cycle.
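The idea of a residue-preserving adder feeding a tree reduction, followed by refinement passes over the residues, can be illustrated in software. The sketch below is an assumption-laden illustration, not the paper's hardware design: it uses Knuth's TwoSum as the residue-preserving addition, and the names two_sum, tree_reduce_pass, and accurate_sum are hypothetical. Its stopping rule (no residues left, or a pass that changes nothing) is a simplification and stands in for the conservative termination detection mentioned in the abstract, so it does not by itself guarantee a correctly rounded result.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b = s + e exactly
    (barring overflow). Here e plays the role of the adder's residue."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def tree_reduce_pass(values):
    """One pairwise (tree-shaped) reduction pass.

    Returns the rounded total and the list of nonzero residues produced by
    the additions. The tree has O(log N) levels; each level's additions are
    independent, which is what a parallel implementation would exploit.
    """
    residues = []
    level = list(values)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            s, e = two_sum(level[i], level[i + 1])
            nxt.append(s)
            if e != 0.0:
                residues.append(e)
        if len(level) % 2:  # odd element passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return (level[0] if level else 0.0), residues


def accurate_sum(values, max_passes=64):
    """Repeat tree-reduce passes over the running total plus the residues.

    Simplified stopping rule (an assumption of this sketch): stop when no
    residues remain, when a pass reproduces the previous state, or after
    max_passes passes.
    """
    total, residues = tree_reduce_pass(values)
    for _ in range(max_passes):
        if not residues:
            break  # the total is exact
        new_total, new_residues = tree_reduce_pass([total] + residues)
        if new_total == total and new_residues == residues:
            break  # fixed point: another pass would change nothing
        total, residues = new_total, new_residues
    return total
```

For example, accurate_sum([1e16, 1.0, -1e16]) returns 1.0 after two passes, whereas left-to-right floating-point summation of the same values returns 0.0 because the 1.0 is lost when added to 1e16.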

Similar Papers

Accurate Parallel Floating-Point Accumulation
Edin Kadric ... Paul Gurniak
IEEE Transactions on Computers | VOL. 65
01 Nov 2016

Floating gate memories: Moore's law continues
S.K Lai
25 Apr 2005

Time Moore: Exploiting Moore's Law From The Perspective of Time
Liming Xiu
IEEE Solid-State Circuits Magazine | VOL. 11
01 Jan 2019

Intel's AMT enables rapid processing and info-turn for Intel's DFM test chip vehicle
Hazem Hajj
05 Oct 2007
