Abstract

The Transformer, a deep learning architecture, has achieved great success in applications such as natural language processing. However, its space and time costs remain a concern, contributing to the monopolization of AI technology. Downsizing this technology is therefore critical to mitigating the adverse effects of such a monopoly on innovation and competition. This article proposes a revised version of the Performer, an efficient variant of the Transformer. Specifically, the proposed revision resolves three issues with the Performer. First, generating each target token requires padding the sequence, which costs O(L) space and time for a maximum sequence length L. Second, the Performer redundantly recomputes attention over previously generated tokens. Third, inconsistencies arise between the training and inference phases because of how the normalizer is calculated in the stable FAVOR+ masked attention operation. The proposed revision introduces a cached version of the FAVOR+ operation, enabling fast text generation with O(L) time and O(1) space complexity. To examine the effectiveness of the revision, a Performer-based encoder-decoder model that generates text feedback for code correction in object-oriented programming education is developed and evaluated. The results show that the revised Performer achieves high accuracy and a more than 120-fold increase in inference speed. In addition, a user evaluation shows that the feedback generated by this model is more beneficial for programming education than that generated by ChatGPT.
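To illustrate the idea of caching the FAVOR+ operation for autoregressive decoding, the following is a minimal sketch, not the authors' implementation: it assumes the standard Performer formulation in which causal attention at step i depends only on running sums of the kernelized keys and values, so each new token updates a fixed-size cache instead of recomputing attention over the whole prefix. The feature map, class name, and dimensions below are illustrative assumptions.

```python
import numpy as np

def feature_map(x, proj):
    # Positive random-feature map in the spirit of FAVOR+ (softmax-kernel
    # approximation). proj: (d, m) random projections; x: (d,) query/key vector.
    scale = np.exp(-np.dot(x, x) / 2.0)
    return scale * np.exp(proj.T @ x) / np.sqrt(proj.shape[1])

class CachedCausalFavorAttention:
    """Hypothetical cached causal FAVOR+ attention for one decoder head."""

    def __init__(self, d_model, n_features, seed=0):
        rng = np.random.default_rng(seed)
        self.proj = rng.standard_normal((d_model, n_features))
        # Running sums over previously generated tokens (the cache);
        # their size depends on (n_features, d_model), not on sequence length.
        self.S = np.zeros((n_features, d_model))  # sum of phi(k_j) v_j^T
        self.z = np.zeros(n_features)             # sum of phi(k_j), the normalizer

    def step(self, q, k, v, eps=1e-6):
        # One decoding step: fold the new key/value into the cache, then
        # attend with the new query. Earlier tokens are never revisited.
        phi_k = feature_map(k, self.proj)
        self.S += np.outer(phi_k, v)
        self.z += phi_k
        phi_q = feature_map(q, self.proj)
        return (phi_q @ self.S) / (phi_q @ self.z + eps)
```

Under these assumptions, each generated token costs O(m·d) work against a constant-size cache, so decoding a sequence of length L takes O(L) time and O(1) space in L, and no padding to the maximum length is needed.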
