Quality of Automated Program Repair on Real-World Defects

Manish Motwani,Claire Le Goues,Yuriy Brun,Rene Just,Mauricio Soto

doi:10.1109/tse.2020.2998785

Abstract

Automated program repair is a promising approach to reducing the costs of manual debugging and increasing software quality. However, recent studies have shown that automated program repair techniques can be prone to producing patches of low quality, overfitting to the set of tests provided to the repair technique, and failing to generalize to the intended specification. This paper rigorously explores this phenomenon on real-world Java programs, analyzing the effectiveness of four well-known repair techniques, GenProg, Par, SimFix, and TrpAutoRepair, on defects made by the projects’ developers during their regular development process. We find that: (1) When applied to real-world Java code, automated program repair techniques produce patches for between 10.6 and 19.0 percent of the defects, which is less frequent than when applied to C code. (2) The produced patches often overfit to the provided test suite, with only between 13.8 and 46.1 percent of the patches passing an independent set of tests. (3) Test suite size has an extremely small but significant effect on the quality of the patches, with larger test suites producing higher-quality patches, though, surprisingly, higher-coverage test suites correlate with lower-quality patches. (4) The number of tests that a buggy program fails has a small but statistically significant positive effect on the quality of the produced patches. (5) Test suite provenance, whether the test suite is written by a human or automatically generated, has a significant effect on the quality of the patches, with developer-written tests typically producing higher-quality patches. And (6) the patches exhibit insufficient diversity to improve quality through some method of combining multiple patches. We develop JaRFly, an open-source framework for implementing techniques for automatic search-based improvement of Java programs. Our study uses JaRFly to faithfully reimplement GenProg and TrpAutoRepair to work on Java code, and makes the first public release of an implementation of Par. Unlike prior work, our study carefully controls for confounding factors and produces a methodology, as well as a dataset of automatically-generated test suites, for objectively evaluating the quality of Java repair techniques on real-world defects.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Journal: IEEE Transactions on Software Engineering	Publication Date: May 31, 2020
Citations: 30	License type: publisher-specific-oa

R Discovery Prime

R Discovery Prime

Quality of Automated Program Repair on Real-World Defects

Abstract

Talk to us

Similar Papers

More From: IEEE Transactions on Software Engineering

Lead the way for us

Similar Papers

JFIX: semantics-based repair of Java programs via symbolic PathFinder
Xuan-Bach D Le ... Claire Le Goues
-
Xuan-Bach D Le, et. al.Xuan-Bach D Le ... Claire Le Goues
10 Jul 2017
10 Jul 2017

ReFixar: Multi-version Reasoning for Automated Repair of Regression Errors
Xuan-Bach D Le ... Quang Loc Le
-
Xuan-Bach D Le, et. al.Xuan-Bach D Le ... Quang Loc Le
01 Oct 2021
01 Oct 2021

Towards efficient and effective automatic program repair
Xuan-Bach D Le
-
Xuan-Bach D LeXuan-Bach D Le
25 Aug 2016
25 Aug 2016

A manual inspection of Defects4J bugs and its implications for automatic program repair
Jiajun Jiang ... Xin Xia
Science China Information Sciences | VOL. 62
Jiajun Jiang, et. al.Jiajun Jiang ... Xin Xia
06 Sep 2019
Science China Information Sciences | VOL. 62

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Quality of Automated Program Repair on Real-World Defects

Abstract

Talk to us

Similar Papers

More From: IEEE Transactions on Software Engineering