Abstract

Transformer-based language models have become the de facto standard for various natural language processing (NLP) applications owing to their superior algorithmic performance. Processing a transformer-based language model on a conventional accelerator induces the memory wall problem, and the ReRAM-based accelerator is a promising solution to this problem. However, due to the characteristics of the self-attention mechanism and of the ReRAM-based accelerator, a pipeline hazard arises when processing the transformer-based language model on the ReRAM-based accelerator. This hazard issue greatly increases the overall execution time. In this article, we propose a framework to resolve the hazard issue. First, we propose the concept of window self-attention, which reduces the attention computation scope by exploiting the properties of the self-attention mechanism. We then present a window-size search algorithm that finds an optimal set of window sizes according to the target application and algorithmic performance. We also suggest a hardware design that exploits the advantages of the proposed algorithmic optimization on a general ReRAM-based accelerator. The proposed work successfully alleviates the hazard issue while maintaining algorithmic performance, leading to a $5.8\times$ speedup over the provisioned baseline. It also delivers up to $39.2\times$ speedup and $643.2\times$ higher energy efficiency than a GPU.
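
To make the attention-scope reduction concrete, the sketch below illustrates the general notion of windowed (local) self-attention: each token attends only to a bounded neighborhood of preceding tokens instead of the full sequence. This is a minimal NumPy sketch of the generic technique; the function name, the causal left-only window, and the `window` parameter are illustrative assumptions, not the paper's exact formulation or its window-size search algorithm.

import numpy as np

def windowed_self_attention(Q, K, V, window):
    """Illustrative windowed self-attention (generic sketch, not the paper's method).
    Q, K, V: (seq_len, d) arrays. Each position i attends only to positions
    in [i - window + 1, i], shrinking the attention scope versus full attention."""
    seq_len, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                # (seq_len, seq_len) similarity scores
    # Band mask: keep only keys within `window` positions to the left (inclusive of self).
    idx = np.arange(seq_len)
    mask = (idx[None, :] <= idx[:, None]) & (idx[None, :] > idx[:, None] - window)
    scores = np.where(mask, scores, -np.inf)     # discard out-of-window positions
    # Row-wise softmax over the remaining (in-window) scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                           # (seq_len, d) attended output

Because each row of the score matrix keeps at most `window` nonzero entries, the per-token attention cost drops from O(seq_len) to O(window), which is the kind of scope reduction the window-size search trades off against algorithmic performance.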
