Abstract
The Transformer is a type of deep learning model that has quickly become fundamental to natural language processing (NLP) and other machine learning tasks. Transformer hardware accelerators are usually designed for a specific model, such as Bidirectional Encoder Representations from Transformers (BERT) or a vision Transformer such as ViT. In this study, we propose a Scalable Transformer Accelerator Unit (STAU) that supports multiple models, enabling efficient handling of the various Transformer models used in voice assistant applications. A design centered on a Variable Systolic Array (VSA), with control and data preprocessing handled by embedded processors, enables matrix operations of varying sizes. In addition, we propose an efficient variable structure and a row-wise data input method for natural language processing, where the word count changes from input to input. The proposed scalable Transformer accelerator accelerates text summarization, audio processing, image search, and generative AI used in voice assistant applications.
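As a software analogy only, the idea of feeding input row by row into a fixed-size compute array can be sketched as follows. This is not the STAU or VSA design itself: the function name `tiled_matmul_rowwise`, the `tile` parameter, and the loop ordering are all assumptions made for illustration. The sketch shows how one input row per word lets the same routine handle any word count, while the weight matrix is consumed in fixed-size tiles.

```python
def tiled_matmul_rowwise(x_rows, w, tile=4):
    """Multiply a variable number of input rows by a weight matrix w.

    Hypothetical illustration: each input row (one per word) is processed
    on its own, and w is consumed in tile-by-tile blocks, loosely mimicking
    a fixed-size compute array. Not the accelerator's actual dataflow.
    """
    k = len(w)                    # inner dimension (rows of w)
    n = len(w[0]) if k else 0     # output width (columns of w)
    out = []
    for row in x_rows:            # the number of rows may differ per call
        if len(row) != k:
            raise ValueError("input row length must equal the number of rows in w")
        acc = [0.0] * n           # one accumulator per output column
        for k0 in range(0, k, tile):          # block over the inner dimension
            for n0 in range(0, n, tile):      # block over the output columns
                for kk in range(k0, min(k0 + tile, k)):
                    a = row[kk]
                    for nn in range(n0, min(n0 + tile, n)):
                        acc[nn] += a * w[kk][nn]
        out.append(acc)
    return out


if __name__ == "__main__":
    weights = [[1, 0], [0, 1], [1, 1]]
    # Two "sentences" with different word counts share the same routine.
    print(tiled_matmul_rowwise([[1, 2, 3]], weights, tile=2))
    print(tiled_matmul_rowwise([[1, 2, 3], [4, 5, 6]], weights, tile=2))
```

Because each row is independent, the row count never needs to be known in advance or padded to a fixed length in this sketch; only the tile size is fixed.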