Abstract

We address the problem of designing efficient and scalable hardware-algorithms for computing the sum and prefix sums of a $w^k$-bit sequence ($k \ge 2$), using as basic building blocks linear arrays of at most $w^2$ shift switches, where $w$ is a small power of 2. An immediate consequence of this feature is that, in our designs, broadcasts are limited to buses of length at most $w^2$. We adopt a VLSI delay model in which the "length" of a bus is proportional to the number of devices on the bus. We begin by discussing a hardware-algorithm that computes the sum of a $w^k$-bit binary sequence in the time of $2k-2$ broadcasts, while the corresponding prefix sums can be computed in the time of $3k-4$ broadcasts. Quite remarkably, even though our hardware-algorithm uses only linear arrays of size at most $w^2$, the total number of broadcasts involved is less than three times the number required by an "ideal" design. We then propose a second hardware-algorithm, operating in pipelined fashion, that computes the sum of a $kw^2$-bit binary sequence in the time of $3k + [\log_w k] - 3$ broadcasts. Using this design, the corresponding prefix sums can be computed in the time of $4k + [\log_w k] - 5$ broadcasts.
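As a rough illustration of what is being computed, the following Python sketch gives a plain software reference for the prefix sums of a binary sequence, together with the broadcast counts quoted above for the first design. It does not model shift switches or buses; the function names `prefix_sums` and `broadcasts_first_design` are hypothetical and chosen only for this example.

```python
from itertools import accumulate


def prefix_sums(bits):
    """Software reference: running sums of a 0/1 sequence (not a hardware model)."""
    return list(accumulate(bits))


def broadcasts_first_design(k):
    """Broadcast counts stated in the abstract for a w^k-bit input, k >= 2."""
    if k < 2:
        raise ValueError("the abstract assumes k >= 2")
    return {"sum": 2 * k - 2, "prefix_sums": 3 * k - 4}


bits = [1, 0, 1, 1]
print(prefix_sums(bits))           # running sums; the last entry is the total
print(broadcasts_first_design(3))  # counts for an input of w^3 bits
```

Note that the broadcast counts depend only on $k$, not on $w$: the value of $w$ sets the maximum bus length ($w^2$) rather than the number of broadcast steps.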

