Abstract

The growth in the size of deep neural network (DNN) models poses both computational and memory challenges to the efficient and effective deployment of DNNs on platforms with limited hardware resources. Our previous work on segmented logarithmic (SegLog) quantization, which adopts both base-2 and base-$\sqrt{2}$ logarithmic encoding, reduces inference cost with only a small accuracy penalty. However, the weight distribution varies across layers and across DNN models, so different base-2 : base-$\sqrt{2}$ ratios are needed to reach the best accuracy, which in turn requires different hardware designs for the decoding and computing parts. This paper extends SegLog quantization by applying layer-wise base-2 : base-$\sqrt{2}$ ratios to weight quantization. The proposed base-reconfigurable segmented logarithmic (BRSLog) quantization achieves 6.4x weight compression with a 1.66% Top-5 accuracy drop on AlexNet at 5-bit resolution. An arithmetic element supporting BRSLog-quantized DNN inference is proposed to adapt to different base-2 : base-$\sqrt{2}$ ratios. With the $\sqrt{2}$ approximation, the resource-consuming multipliers can be replaced by shifters and adders at only a 0.54% accuracy penalty. The proposed arithmetic element is simulated in a UMC 55 nm Low Power process; it is 50.42% smaller in area and consumes 55.60% less power than the widely used 16-bit fixed-point multiplier. Compared with an equivalent SegLog arithmetic element designed for a fixed base-2 : base-$\sqrt{2}$ ratio, the base-reconfigurable part increases the area by only 22.96 μm² and the power consumption by only 2.6 μW.
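
As a rough illustration of the encoding idea described above (not the paper's exact algorithm), the Python sketch below quantizes a layer's weights to the nearest power of 2 or power of $\sqrt{2}$ and replaces the residual $\sqrt{2}$ factor with a shift-and-add, assuming the common approximation $\sqrt{2} \approx 1 + 2^{-1}$. The function names, the layer-wise split threshold, and the assignment of weights to the two segments are all illustrative assumptions.

```python
import numpy as np

def brslog_quantize(weights, base2_ratio=0.5):
    """Sketch of layer-wise segmented logarithmic weight quantization.

    A fraction of the weights (base2_ratio, assumed here to be the
    largest-magnitude ones) is encoded as signed powers of 2 (integer
    exponents); the rest use base-sqrt(2) encoding (half-integer
    exponents). The threshold rule and interface are assumptions made
    for illustration, not the paper's exact scheme.
    """
    w = np.asarray(weights, dtype=np.float64)
    signs = np.sign(w)
    mags = np.abs(w)
    safe = np.where(mags > 0, mags, 1.0)          # avoid log2(0)
    thresh = np.quantile(mags[mags > 0], 1.0 - base2_ratio)

    exp2 = np.round(np.log2(safe))                 # integer exponents (base 2)
    exp_sqrt2 = np.round(2.0 * np.log2(safe)) / 2  # half-integer exponents (base sqrt(2))

    q = np.where(mags >= thresh, 2.0 ** exp2, 2.0 ** exp_sqrt2)
    return np.where(mags > 0, signs * q, 0.0)

def shift_add_multiply(x, exponent):
    """Multiply integer x by 2**exponent (exponent may be half-integer)
    using only shifts and adds, with sqrt(2) ~ 1 + 2**-1 (an assumption)."""
    k = int(np.floor(exponent))
    y = x << k if k >= 0 else x >> (-k)
    if exponent != k:            # half-integer exponent -> extra sqrt(2) factor
        y = y + (y >> 1)         # sqrt(2) approximated as 1.5 = 1 + 2^-1
    return y
```

In such a scheme, a multiplication by a quantized weight reduces to a barrel shift plus, for the base-$\sqrt{2}$ segment, one additional add, which is what allows the multiplier in the arithmetic element to be replaced by shifters and adders.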
