Abstract

Attention-based transformer language models have shown significant performance gains in various natural language tasks. In this work, we explore the impact of transformer language models on the task of source code suggestion. The core intention of this work is to boost the modeling performance for the source code suggestion task and to explore how the training procedures and model architectures impact modeling performance. Additionally, we propose a transformer-based self-supervised learning technique called Transformer Gated Highway that outperforms recurrent and transformer language models of comparable size. The proposed approach combines the Transformer language model with Gated Highway introducing a notion of recurrence. We compare the performance of the proposed approach with transformer-based BERT (CodeTran), RoBERTa (RoBERTaCode), GPT2 (TravTrans), CodeGen and recurrent neural language-based LSTM (CodeLSTM) models. Moreover, we have experimented with various architectural settings for the transformer models to evaluate their impact on modeling performance. The extensive evaluation of the presented approach exhibits better performance on two programming language datasets; Java and C#. Additionally, we have adopted the presented approach for the syntax error correction task to predict the correct syntax token to render its possible implications for other source code modeling tasks.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call