Abstract

This paper proposes a design methodology for a compact edge vision transformer (ViT) computation-in-memory (CiM) architecture. ViT has attracted much attention for its high inference accuracy. However, achieving that accuracy conventionally requires fine-tuning many parameters from models pre-trained on large datasets, and inference requires a large number of matrix multiplications. Thus, to map ViT compactly onto non-volatile memory (NVM)-based CiM for inference on edge applications (IoT/mobile devices), this paper analyzes fine-tuning in training, and clipping and quantization in inference. The proposed compact edge ViT CiM can be optimized by three design methods according to the use case, considering the required fine-tuning time, the ease of setting memory bit precision, and the memory-error tolerance of ViT CiM. As a result, on CIFAR-10, the most compact type reduces the total memory size of ViT by 85.8% compared with the conventional ViT. Furthermore, the high-accuracy type and the high-error-tolerance type improve inference accuracy by 4.4% and memory-error tolerance by more than four times, respectively, compared with convolutional neural networks.
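The clipping and quantization mentioned above can be illustrated with a minimal sketch. This is not the paper's method; it is a generic symmetric clip-and-quantize of weights to a low bit precision, of the kind used when mapping parameters onto NVM cells. The function name, the `clip_ratio` parameter, and the example values are illustrative assumptions.

```python
def clip_and_quantize(weights, clip_ratio=0.9, bits=4):
    # Clip weights to a fraction of their maximum magnitude, then
    # uniformly quantize to signed integers of the given bit width.
    # (Illustrative only; the paper's actual scheme may differ.)
    limit = clip_ratio * max(abs(w) for w in weights)
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels for 4-bit signed
    scale = limit / levels
    q = [round(max(-limit, min(limit, w)) / scale) for w in weights]
    return q, scale

# Hypothetical weight values quantized to 4-bit codes
weights = [0.8, -1.2, 0.05, 2.0, -0.4]
q, scale = clip_and_quantize(weights)
dequantized = [v * scale for v in q]      # values as they would be read back from memory
```

Clipping before quantization trades a small error on outlier weights for a finer quantization step on the bulk of the distribution, which is one reason it can help preserve accuracy at low bit precision.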
