Abstract
Accelerating Visual Neural Networks in Edge Computing environments is crucial for processing image and video data. Visual Neural Networks, including Convolutional Neural Networks and Vision Transformers, are central to image recognition, video analysis, and object detection tasks. Deploying these networks on edge devices and accelerating them can significantly enhance data processing speed and efficiency. The large number of parameters, complex computational flows, and various structural variants of Transformer models present both opportunities and challenges. We propose Vis-TOP (Vision Transformer Overlay Processor), an overlay processor designed for all types of Vision Transformer models. Vis-TOP, distinct from coarse-grained general-purpose accelerators like GPUs and fine-grained custom designs, encapsulates Vision Transformer characteristics into a three-layer, two-level mapping structure, enabling flexible model switching without hardware architecture modifications. Concurrently, we designed a corresponding instruction bundle and hardware architecture within this mapping structure. We implemented the overlay processor design on the ZCU102 after quantizing the Swin Transformer model to 8-bit fixed points (fix_8). Experimentally, our throughput surpasses GPU implementation by 1.5 times. Our throughput per DSP is 2.2 to 11.7 times higher than that of existing Transformer-like accelerators. Overall, our approach satisfies real-time AI requirements in resource consumption and inference speed. Vis-TOP offers a cost-effective image processing solution for Edge Computing on reconfigurable devices, enhancing computational resource utilization, saving data transfer time and costs, and reducing latency.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.