Even more so than natural language, code is extremely sensitive to syntax: a single small error can make an entire snippet invalid. It is therefore important to explore methods for ensuring syntactic correctness in generated code. Existing methods that address this issue often rely on syntax-guided decoders with complex architectures. In this work, we present the grammar enforcement method, which introduces a separate layer that constrains the decisions of the transformer during fine-tuning according to syntactic constructs present in both the target language grammar and the given training set. We experiment with the Hearthstone dataset to study the method's effects on the resulting programs and compare it with existing state-of-the-art syntax-guided decoders. We demonstrate a statistically significant positive effect of grammar enforcement on the quality of generated programs in terms of exact match accuracy and the percentage of grammatically correct samples. At the same time, we observe lower values for the text-based metrics chrF and BLEU, potentially indicating their inability to represent the quality of generated abstract syntax sequences.