Abstract

This paper proposes a design methodology for a compact edge vision transformer (ViT) computation-in-memory (CiM) architecture. ViT has attracted much attention for its high inference accuracy. However, achieving that accuracy conventionally requires fine-tuning many parameters from models pre-trained on large datasets, and inference requires a large number of matrix multiplications. Thus, to map ViT compactly onto non-volatile memory (NVM)-based CiM for inference on edge applications (IoT/mobile devices), this paper analyzes fine-tuning in training, and clipping and quantization in inference. The proposed compact edge ViT CiM can be optimized by three design methods according to the use case, considering the required fine-tuning time, the ease of setting memory bit precision, and the memory-error tolerance of ViT CiM. As a result, on CIFAR-10, the most compact type reduces the total memory size of ViT by 85.8% compared with the conventional ViT. Furthermore, the high-accuracy type and the high-error-tolerance type improve inference accuracy by 4.4% and memory-error tolerance by more than four times, respectively, compared with convolutional neural networks.
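The clipping and quantization mentioned above can be illustrated with a minimal sketch. This is not the paper's method; it is a generic symmetric clip-and-quantize of weights to a low bit precision, of the kind used when mapping parameters onto NVM cells. The function name, the `clip_ratio` parameter, and the example values are illustrative assumptions.

```python
def clip_and_quantize(weights, clip_ratio=0.9, bits=4):
    # Clip weights to a fraction of their maximum magnitude, then
    # uniformly quantize to signed integers of the given bit width.
    # (Illustrative only; the paper's actual scheme may differ.)
    limit = clip_ratio * max(abs(w) for w in weights)
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels for 4-bit signed
    scale = limit / levels
    q = [round(max(-limit, min(limit, w)) / scale) for w in weights]
    return q, scale

# Hypothetical weight values quantized to 4-bit codes
weights = [0.8, -1.2, 0.05, 2.0, -0.4]
q, scale = clip_and_quantize(weights)
dequantized = [v * scale for v in q]      # values as they would be read back from memory
```

Clipping before quantization trades a small error on outlier weights for a finer quantization step on the bulk of the distribution, which is one reason it can help preserve accuracy at low bit precision.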
