Abstract

Because repairing defective programs is costly, much research focuses on automatic program repair (APR). A recent trend in APR is to apply neural networks to automatically mine the relations between defective programs and their corresponding patches, an approach known as neural program repair (NPR). The community, however, has overlooked some important properties that affect the applicability of NPR systems, such as robustness: for semantically identical buggy programs, an NPR system may produce totally different patches. In this paper, we propose RobustNPR, the first tool for evaluating the robustness of NPR. RobustNPR employs several mutators to generate semantically identical mutants of defective programs. For an original defective program and its mutant, it checks two aspects of NPR: (a) can NPR fix the mutant when it can fix the original defective program? and (b) can NPR generate semantically identical patches for the original program and the mutant? We then evaluate four state-of-the-art NPR models and analyze the results. We find that even for the best-performing model, 20.16% of the repair successes are unreliable, which indicates that the robustness of NPR is far from perfect. In addition, we find that the robustness of NPR is correlated with model settings and other factors.
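
To make the two checks concrete, here is a minimal sketch in Python. It is not RobustNPR's implementation. It assumes a single hypothetical mutator (variable renaming with a fresh name), a toy stand-in for an NPR model, and a toy correctness oracle. Every name in it (`rename_variable`, `check_robustness`, `toy_repair`, `toy_oracle`) is an illustrative assumption, and patch equivalence is approximated by undoing the renaming and comparing whitespace-normalized text.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical interfaces: a "model" maps buggy code to a patch (or None),
# and an oracle decides whether a patch is correct.
Repair = Callable[[str], Optional[str]]
Oracle = Callable[[str], bool]


def rename_variable(code: str, old: str, new: str) -> str:
    """Assumed mutator: rename an identifier at word boundaries.

    Semantics are preserved only if `new` does not already occur in `code`.
    """
    return re.sub(rf"\b{re.escape(old)}\b", new, code)


def _canon(code: str) -> str:
    """Collapse whitespace so layout differences do not count as changes."""
    return " ".join(code.split())


def check_robustness(repair: Repair, is_correct: Oracle,
                     buggy: str, old: str, new: str) -> Dict[str, bool]:
    mutant = rename_variable(buggy, old, new)
    patch_orig = repair(buggy)
    patch_mut = repair(mutant)
    # Undo the renaming so both patches use the same vocabulary.
    patch_mut_back = (None if patch_mut is None
                      else rename_variable(patch_mut, new, old))
    fixed_orig = patch_orig is not None and is_correct(patch_orig)
    fixed_mut = patch_mut_back is not None and is_correct(patch_mut_back)
    # Check (b), approximated textually: are the two patches the same?
    same = (fixed_orig and fixed_mut
            and _canon(patch_orig) == _canon(patch_mut_back))
    return {"fixed_original": fixed_orig,   # precondition of check (a)
            "fixed_mutant": fixed_mut,      # check (a)
            "same_patch": same}             # check (b)


def toy_repair(code: str) -> Optional[str]:
    # Brittle toy "model": it recognizes the off-by-one bug only when
    # the loop variable is literally named i.
    if "i <= n" in code:
        return code.replace("i <= n", "i < n")
    return None


def toy_oracle(patch: str) -> bool:
    # Toy oracle: the patch is correct if the off-by-one bound is gone.
    return "<= n" not in patch


BUGGY = "for (int i = 0; i <= n; i++) { sum += a[i]; }"
result = check_robustness(toy_repair, toy_oracle, BUGGY, "i", "idx")
print(result)
```

In this toy setting the model repairs the original program but fails on the renamed mutant, which is the kind of unreliable repair success the abstract describes. A real evaluation would replace the textual comparison with a proper semantic-equivalence check and the toy oracle with test execution or a reference patch.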
