Abstract
Large pre-trained language models such as GPT-3 [10], Codex [11], and Google's language model [7] are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and caution. On the optimistic side, such large language models have the potential to improve productivity by providing an automated AI pair programmer for every programmer in the world. On the cautionary side, since these large language models do not understand program semantics, they offer no guarantees about the quality of the suggested code. In this paper, we present an approach to augment these large language models with post-processing steps based on program analysis and synthesis techniques that understand the syntax and semantics of programs. Further, we show that such techniques can make use of user feedback and improve with usage. We present our experiences from building and evaluating such a tool, Jigsaw, targeted at synthesizing code that uses the Python Pandas API from multi-modal inputs. Our experience suggests that as these large language models evolve for synthesizing code from intent, Jigsaw has an important role to play in improving the accuracy of such systems.
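To make the post-processing idea concrete, the following is a minimal sketch of one multi-modal check the abstract alludes to: filtering model-generated Pandas candidates against a user-supplied input-output example. All names here (`validate_candidates`, the `out` binding convention, the toy candidates) are illustrative assumptions, not Jigsaw's actual implementation.

```python
import pandas as pd

def validate_candidates(candidates, df_in, df_expected):
    """Keep only candidate snippets whose result matches the expected
    DataFrame on the user-provided input-output example."""
    passing = []
    for code in candidates:
        env = {"df": df_in.copy(), "pd": pd}
        try:
            # Convention (assumed): a candidate binds its result to `out`.
            exec(code, env)
            result = env.get("out")
            if isinstance(result, pd.DataFrame) and result.equals(df_expected):
                passing.append(code)
        except Exception:
            continue  # discard candidates that raise
    return passing

# Toy example: intent "drop rows with missing values".
df_in = pd.DataFrame({"a": [1, None, 3], "b": [4, 5, 6]})
df_expected = df_in.dropna().reset_index(drop=True)

candidates = [
    "out = df.dropna().reset_index(drop=True)",  # semantically correct
    "out = df.fillna(0)",                        # wrong semantics, filtered out
]
print(validate_candidates(candidates, df_in, df_expected))
```

A semantic filter like this is what lets the system offer stronger quality signals than the language model alone: a candidate survives only if it actually reproduces the user's example.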