Unlike addition and multiplication, which benefit from the parallelism of residue number systems (RNS), RNS division is often performed via a straightforward RDF scheme (i.e., reverse conversion, binary division, and forward conversion). Given the lack of any faster-than-RDF RNS divider, we designed one for the popular moduli set {2^n + 1, 2^n, 2^n − 1}, using a mixed-radix representation of the operands and two parallel 2n-bit dividers (vs. the 3n-bit divider required by RDF). The resulting RNS integer quotient can differ from the correct result by ±1, which is tolerable in many approximate arithmetic processes. The exact quotient can still be obtained with additional hardware (i.e., six parallel n-bit comparators). Our experimental results for n = 8 (16), compared to RDF, show 33% (36%) less delay, 3% (12%) less energy, and 35% (43%) less energy-delay product for the minimal-error approximate divider, and 14% (21%) less delay for the exact implementation, at the cost of additional area consumption and power dissipation.
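For readers unfamiliar with the baseline, the RDF scheme can be illustrated in software. The following is a minimal sketch only, assuming Python 3.8+ and using the Chinese remainder theorem for reverse conversion; the function names (`moduli`, `forward`, `reverse`, `rdf_divide`) are hypothetical and chosen for illustration. It models the RDF baseline, not the proposed mixed-radix divider or any hardware design.

```python
from math import prod


def moduli(n):
    """Return the moduli set {2^n + 1, 2^n, 2^n - 1} as a tuple."""
    return (2**n + 1, 2**n, 2**n - 1)


def forward(x, mods):
    """Forward conversion: integer -> tuple of residues."""
    return tuple(x % m for m in mods)


def reverse(residues, mods):
    """Reverse conversion via the Chinese remainder theorem.

    Assumes the moduli are pairwise coprime, which holds for
    {2^n + 1, 2^n, 2^n - 1} when n >= 2.
    """
    big_m = prod(mods)
    total = 0
    for r, m in zip(residues, mods):
        m_i = big_m // m
        # pow(m_i, -1, m) is the modular inverse of m_i modulo m.
        total += r * m_i * pow(m_i, -1, m)
    return total % big_m


def rdf_divide(a_res, b_res, mods):
    """RDF division: reverse-convert, divide in binary, forward-convert."""
    a = reverse(a_res, mods)
    b = reverse(b_res, mods)
    if b == 0:
        raise ZeroDivisionError("divisor is zero")
    return forward(a // b, mods)
```

With n = 8 the dynamic range is 257 · 256 · 255, so the binary division in the middle step operates on operands of roughly 3n bits; this is the wide divider that the abstract contrasts with two parallel 2n-bit dividers.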