Crex: Predicting patch correctness in automated repair of C programs through transfer learning of execution semantics

Dapeng Yan, Zhe Liu, Tegawendé F Bissyandé, Yuqing Niu, Jacques Klein, Zhiming Liu, Li Li, Kui Liu

doi:10.1016/j.infsof.2022.107043

Abstract

A significant body of automated program repair (APR) literature relies on test suites to assess the validity of generated patches. Because such oracles are weak, state-of-the-art repair tools can validate patches that overfit the test cases but are actually incorrect. This situation has become a prime concern in APR, hindering its adoption by industry. This work investigates execution semantic features based on micro-traces, a form of under-constrained dynamic traces. We build on transfer learning to explore function code representations that are amenable to semantic similarity computation and can therefore be leveraged for classifying patch correctness. Our Crex prototype implementation is based on the Trex framework. Experimental results on patches generated by the CoCoNut APR tool on CodeFlaws programs indicate that our approach can yield high accuracy in predicting patch correctness. The learned embeddings were shown to capture semantic similarities between functions, which was instrumental in training a classifier that identifies patch correctness by learning to discriminate between correctly patched code and incorrectly patched code based on their semantic similarity with the buggy function.
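The last step of the abstract, classifying a patch from the similarity between the embedding of the patched function and that of the buggy function, can be illustrated with a minimal sketch. This is not the Crex implementation: the embeddings are assumed to be plain lists of floats already produced by some model, cosine similarity is assumed as the similarity measure, and a one-dimensional decision stump stands in for the actual classifier. All function names are hypothetical. Because the abstract does not say whether correct patches sit closer to or farther from the buggy function, the sketch learns that direction from labeled examples.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def fit_stump(similarities, labels):
    """Fit a decision stump on one feature (hypothetical stand-in classifier).

    similarities: similarity of each patched function to its buggy function.
    labels: True if the patch is correct, False if it is overfitting.
    Returns (threshold, above_is_correct) with the best training accuracy.
    """
    values = sorted(set(similarities))
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    if not candidates:
        raise ValueError("need at least two distinct similarity values")
    best = None
    for threshold in candidates:
        for above_is_correct in (True, False):
            preds = [(s > threshold) == above_is_correct for s in similarities]
            accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if best is None or accuracy > best[0]:
                best = (accuracy, threshold, above_is_correct)
    return best[1], best[2]


def predict(similarity, threshold, above_is_correct):
    """Return True if the stump labels the patch as correct."""
    return (similarity > threshold) == above_is_correct
```

In use, one would compute `cosine_similarity(buggy_embedding, patched_embedding)` for each labeled patch, pass the resulting values to `fit_stump`, and then call `predict` on the similarity of an unseen patch. A real system would likely feed richer features into a stronger classifier; the stump only makes the similarity-based decision explicit.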

Full Text