Abstract

Quantization is a widely used technique for deploying deep learning models on embedded systems because it can dramatically reduce model size and computation. Many quantization approaches have been proposed in recent years. Aggressive approaches reduce model size and computation substantially, but accuracy can drop significantly. To address this, some research groups have proposed smoother approaches that reduce the accuracy loss; however, these smoother approaches consume far more resources than aggressive ones. In this work, we propose a quantization approach that dramatically reduces resource utilization without losing much accuracy. We have successfully applied our approach to a reservoir computing (RC) system. Compared with an RC system using floating-point numbers, our proposed RC system reduces the utilization of BRAM, DSP, Flip-Flop (FF), and Lookup Table (LUT) resources by 47%, 93%, 93%, and 87%, respectively, while losing only 0.08% accuracy on the NARMA10 dataset. Moreover, our proposed RC system uses approximately 45%, 14%, and 21% less BRAM, FF, and LUT, respectively, than a quantized RC system built with another popular quantization approach.
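The abstract does not specify the quantization scheme, but the general idea it describes (replacing floating-point weights with low-bit-width values to save memory and arithmetic resources) can be illustrated with a minimal sketch of uniform affine quantization. The function names and the 8-bit width below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def quantize(x, num_bits=8):
    # Uniform affine quantization: map float values onto num_bits-wide
    # unsigned integers via a scale and zero point (illustrative only;
    # the paper's actual scheme is not specified in the abstract).
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover an approximation of the original float values.
    return scale * (q.astype(np.float32) - zero_point)

# Example: a small hypothetical weight vector.
w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, s, z = quantize(w)
w_hat = dequantize(q, s, z)
```

Storing `q` instead of `w` cuts memory fourfold (8-bit vs. 32-bit), and integer arithmetic on FPGAs maps to LUT/FF logic rather than DSP-heavy floating-point units, which is consistent with the resource reductions reported above.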
