Abstract

Code generation aims to map natural language descriptions to code snippets. Recent approaches using sequence-to-tree models have shown promising results. However, they generally predict the next token autoregressively, conditioning only on previously generated tokens, and do not consider potential future tokens. To address this issue, we propose Contextor, a novel context-sensitive model employing a bidirectional decoder to generate tokens in two different orders synchronously and interactively. Specifically, we employ two decoders to generate two sequences under different traversal orders and share their context knowledge via the attention mechanism. As a result, our model can synthesize both previous and future information simultaneously. To alleviate the information leakage problem caused by the teacher-forcing training strategy combined with bidirectional decoding, we propose an adapted scheduled sampling technique that prevents the decoders from directly accessing the ground-truth tokens. Furthermore, Contextor features a bidirectional beam search algorithm that lets the two decoders interact during inference. Experimental results demonstrate that our approach outperforms the state-of-the-art baselines.
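The following is a minimal, illustrative sketch (not the authors' implementation) of the bidirectional-decoding idea described above: two decoders process the target in opposite traversal orders, and each step lets one decoder attend to the hidden states already produced by the other, so that "previous" and "future" context are combined. All module and parameter names are assumptions introduced for illustration.

```python
import torch
import torch.nn as nn


class BidirectionalDecoderSketch(nn.Module):
    """Hypothetical two-decoder setup with cross-decoder attention."""

    def __init__(self, vocab_size, d_model=256, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # One recurrent cell per decoding direction (forward / backward traversal).
        self.fwd_cell = nn.GRUCell(d_model, d_model)
        self.bwd_cell = nn.GRUCell(d_model, d_model)
        # Cross-decoder attention: one decoder queries the other's states.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(2 * d_model, vocab_size)

    def step(self, tok_fwd, tok_bwd, h_fwd, h_bwd, other_states):
        """One synchronous decoding step for both directions.

        other_states: (batch, t, d_model) hidden states already produced by
        the opposite decoder, used here as attention keys and values.
        """
        h_fwd = self.fwd_cell(self.embed(tok_fwd), h_fwd)
        h_bwd = self.bwd_cell(self.embed(tok_bwd), h_bwd)
        # The forward decoder attends to the backward decoder's context; a
        # full implementation would do the symmetric step for h_bwd as well.
        ctx, _ = self.cross_attn(h_fwd.unsqueeze(1), other_states, other_states)
        logits_fwd = self.out(torch.cat([h_fwd, ctx.squeeze(1)], dim=-1))
        return logits_fwd, h_fwd, h_bwd
```

In this sketch, scheduled sampling would correspond to occasionally feeding the model's own predicted tokens as tok_fwd / tok_bwd instead of ground-truth tokens during training, which is one way to limit the information leakage the abstract mentions.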
