Abstract
Text generation from semantic parses is the task of generating textual descriptions for formal representation inputs such as logical forms and SQL queries. This is challenging for two reasons: (1) the complex and intensive inner logic, combined with data scarcity, and (2) the lack of automatic evaluation metrics for logic consistency. To address these two challenges, this paper first proposes SNOWBALL, a framework for logic-consistent text generation from semantic parses that employs an iterative training procedure, recursively augmenting the training set under quality control. Second, we propose a novel automatic metric, BLEC, for evaluating the logical consistency between semantic parses and generated texts. Experimental results on two benchmark datasets, Logic2Text and Spider, demonstrate that the SNOWBALL framework enhances logic consistency under both BLEC and human evaluation. Furthermore, our statistical analysis reveals that BLEC is more consistent with human evaluation than general-purpose automatic metrics, including BLEU, ROUGE, and BLEURT. Our data and code are available at this https URL.
Highlights
Natural language generation (NLG) from semantic parses is the task of generating a text description for a formal representation input such as a logical form, AMR, or SQL query
General-purpose metrics are not ideal for explicitly measuring logic consistency (Wang et al., 2020b; Harkous et al., 2020), because they tend to weight each word in the generated text evenly without fully attending to the critical logical keywords. To address these two problems, we propose the SNOWBALL framework for high-fidelity text generation from semantic parses and the BLEC automatic evaluation metric for logic consistency
To evaluate the logic consistency of the text generated by the model, we propose a rule-based automatic evaluation metric called Bidirectional Logic Evaluation of Consistency (BLEC)
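The core idea of a bidirectional, rule-based check can be sketched as follows. This is an illustrative simplification, not the paper's implementation: the keyword table and function name are assumptions, and the real BLEC rules are dataset-specific.

```python
# Hypothetical sketch of a BLEC-style bidirectional keyword check.
# AGG_KEYWORDS and blec_style_check are illustrative assumptions,
# not the paper's actual rule set.

AGG_KEYWORDS = {
    "max": ["highest", "most", "largest", "maximum"],
    "min": ["lowest", "least", "smallest", "minimum"],
    "count": ["number of", "how many", "count"],
}

def blec_style_check(logical_form: str, text: str) -> bool:
    """Return True iff every logical keyword in the parse is expressed
    in the text AND no unsupported keyword appears in the text."""
    lf = logical_form.lower()
    txt = text.lower()
    for op, phrases in AGG_KEYWORDS.items():
        in_lf = op in lf
        in_text = any(p in txt for p in phrases)
        # Bidirectional: a mismatch in either direction is inconsistent.
        if in_lf != in_text:
            return False
    return True
```

For example, `blec_style_check("max(points)", "the player with the highest points")` passes, while pairing the same parse with "the player with the lowest points" fails, since the text expresses an operator the parse does not contain.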
Summary
Natural language generation (NLG) from semantic parses is the task of generating a text description for a formal representation input such as a logical form, AMR, or SQL query. General-purpose metrics are not ideal for explicitly measuring logic consistency (Wang et al., 2020b; Harkous et al., 2020), because they tend to weight each word in the generated text evenly without fully attending to the critical logical keywords. To address these two problems, we propose the SNOWBALL framework for high-fidelity text generation from semantic parses and the BLEC automatic evaluation metric for logic consistency. To deal with the data scarcity issue, we propose a data augmentation procedure that covers valid logic variations with diverse natural language expressions to improve generalizability. To this end, during each iteration, various unseen logic pairs can be automatically generated from rule-based enumerated logic forms and their corresponding text predicted by the generator. Our statistical analysis reveals that BLEC achieves a +0.66 Pearson correlation coefficient with human labels, serving as a much better automatic evaluation metric than the traditional BLEU and ROUGE metrics as well as the more recent BLEURT metric
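The iterative augmentation procedure described above can be sketched as a loop: retrain the generator on the current data, label rule-enumerated unseen logic forms with it, and keep only pairs that survive a quality filter. All helper names below are illustrative stand-ins, not the paper's code.

```python
# Hypothetical sketch of a Snowball-style iterative training loop.
# train_generator, enumerate_logic_forms, and passes_quality_check
# are placeholder callables supplied by the caller, not the paper's API.

def snowball(seed_pairs, train_generator, enumerate_logic_forms,
             passes_quality_check, n_iterations=3):
    """Recursively augment a (logic form, text) training set."""
    train_set = list(seed_pairs)
    for _ in range(n_iterations):
        generator = train_generator(train_set)       # fit on current data
        for lf in enumerate_logic_forms(train_set):  # unseen logic variations
            text = generator(lf)                     # predicted description
            if passes_quality_check(lf, text):       # e.g., a BLEC-style check
                train_set.append((lf, text))
    return train_set
```

The quality filter is what keeps the "snowball" from accumulating noise: only pairs judged logically consistent are fed back into the next training round.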