A fundamental operator in modern computational systems has a multitude of highly optimized implementations. In general-purpose systems, the width has grown to 128 bits, putting pressure on designers to use sophisticated carry lookahead or tree adders to maintain throughput while sacrificing area and energy. However, the typical workload mostly exercises the lower 10–15 bits. This leaves many devices on and unused during normal operation, reducing the overall performance. We hypothesize that bit- or digit-serial implementations for arbitrary-length streams represent an opportunity to decrease the overall energy usage while increasing the throughput/area efficiency of the system and verify this hypothesis by constructing an asynchronous digit-serial adder for comparison against its bit-parallel counterparts.
Read full abstract