Code Completion Tools Research Articles

Large-scale generative models have enabled the development of AI-powered code completion tools to assist programmers in writing code. Like all AI-powered tools, these code completion tools are not always accurate and can introduce bugs or even security vulnerabilities into code if not properly detected and corrected by a human programmer. One technique that has been proposed and implemented to help programmers locate potential errors is to highlight uncertain tokens. However, little is known about the effectiveness of this technique. Through a mixed-methods study with 30 programmers, we compare three conditions: providing the AI system's code completion alone, highlighting tokens with the lowest likelihood of being generated by the underlying generative model, and highlighting tokens with the highest predicted likelihood of being edited by a programmer. We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits, and is subjectively preferred by study participants. In contrast, highlighting tokens according to their probability of being generated does not provide any benefit over the baseline with no highlighting. We further explore the design space of how to convey uncertainty in AI-powered code completion tools and find that programmers prefer highlights that are granular, informative, interpretable, and not overwhelming. This work contributes to building an understanding of what uncertainty means for generative models and how to convey it effectively.

Read full abstract

Software developers frequently use code completion tools to accelerate software development by suggesting the following code elements. Researchers usually employ AutoRegressive (AR) decoders to complete code sequences in a left-to-right, token-by-token fashion. To improve the accuracy and efficiency of code completion, we argue that tokens within a code statement have the potential to be predicted concurrently. In this article, we first conduct an empirical study to analyze the dependency among the target tokens in line-level code completion. The results suggest that it is potentially practical to generate all statement tokens in parallel. To this end, we introduce SANAR, a simple and effective syntax-aware non-autoregressive model for line-level code completion. To further improve the quality of the generated code, we propose an adaptive and syntax-aware sampling strategy to boost the model’s performance. The experimental results obtained from two widely used datasets indicate that our model outperforms state-of-the-art code completion approaches of similar model size by a considerable margin, and is faster than these models with up to 9× speed-up. Moreover, the extensive results additionally demonstrate that the enhancements achieved by SANAR become even more pronounced with larger model sizes, highlighting their significance.

Read full abstract

Code Completion Tools Research Articles

Related Topics

Articles published on Code Completion Tools

Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions

Special Issue on Informatics Education: Exploring the Impact of GitHub Copilot on Health Informatics Education.

Non-Autoregressive Line-Level Code Completion

Promoting open science in test-driven software experiments

DeepVS: an efficient and generic approach for source code modelling usage

Improving code completion with program history

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Code Completion Tools Research Articles

Related Topics

Articles published on Code Completion Tools

Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions

Special Issue on Informatics Education: Exploring the Impact of GitHub Copilot on Health Informatics Education.

Non-Autoregressive Line-Level Code Completion

Promoting open science in test-driven software experiments

DeepVS: an efficient and generic approach for source code modelling usage

Improving code completion with program history