Abstract

In this article, we empirically study the suitability of tests as acceptance criteria for automated program fixes, by checking patches produced by automated repair tools with a bug-finding tool, as opposed to previous work that relied on tests or manual inspection. We develop a number of experiments in which faulty programs from IntroClass, a well-known benchmark for program repair techniques, are fed to the program repair tools GenProg, Angelix, AutoFix and Nopol, using test suites of varying quality, including those accompanying the benchmark. We then check the produced patches against formal specifications using a bug-finding tool. Our results show that, in the studied scenarios, automated program repair tools are significantly more likely to accept a spurious program fix than to produce an actual one. Using bounded-exhaustive suites larger than those originally provided (with about 100 and 1,000 tests), we observe that overfitting is reduced, but (a) few new correct repairs are generated, and (b) some tools see their performance degrade with the larger suites, producing fewer correct repairs. Finally, by comparing with previous work, we show that overfitting is underestimated in semantics-based tools, and that patches not discarded using held-out tests may be discarded using a bug-finding tool.
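To make the notion of an overfitting (spurious) patch concrete, consider the following hypothetical sketch, not taken from the paper: an IntroClass-style "median of three" routine whose "repair" satisfies the only tests available to the repair tool, yet violates the intended specification on other inputs. This is precisely the kind of patch that test-based acceptance lets through and a bug-finding tool working against a formal specification can reject.

```c
/*
 * Hypothetical illustration (not from the benchmark): an overfitting patch
 * for an IntroClass-style "median of three" routine. It passes the two
 * sample tests below but is wrong in general.
 */
#include <assert.h>
#include <stdio.h>

/* Intended specification: return the middle value of a, b, c. */
int median_patched(int a, int b, int c) {
    /* Overfitting "fix": correct whenever b happens to be the median,
       as in the tests below, but wrong e.g. for (3, 1, 2). */
    return b;
}

int main(void) {
    /* The only acceptance tests available to the repair tool. */
    assert(median_patched(1, 2, 3) == 2);  /* passes */
    assert(median_patched(9, 5, 1) == 5);  /* passes */
    /* A specification-based or bounded-exhaustive check would reject it:
       median_patched(3, 1, 2) returns 1, while the median is 2. */
    printf("All provided tests pass, yet the patch is spurious.\n");
    return 0;
}
```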
