Abstract

Forward porting reintroduces previously detected and patched software bugs from older versions of a program into later ones to create benchmarking workloads for fuzzing. These benchmarks gauge a fuzzer's performance by testing its ability to detect or trigger the reintroduced bugs during a fuzzing campaign. In this study, we evaluate the reliability of forward porting in establishing dependable fuzzing benchmarks and the suitability of such benchmarks for fair and accurate fuzzer evaluation. We use online resources, forward porting, fuzzing experiments, and triaging to scrutinize the workloads of a state-of-the-art fuzzing benchmark. We uncover seven factors, including software architecture changes, misconfigurations, supply chain issues, and developer errors, that compromise the success of forward porting. We determine that the ‘ground truth’ established through forward porting is only occasionally ‘true’, because every examined software application undergoing this process carries unaccounted-for underlying bugs. These findings call into question the reliability of forward porting for generating dependable fuzzing benchmarks. Furthermore, our experimental results suggest that relying on forward-porting-based ground truth and verification metrics can lead to misleading evaluations of fuzzer performance. Finally, we offer insights into the development of fuzzing benchmarks to enable more dependable assessments of fuzzers.
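The abstract does not prescribe a particular mechanism for forward porting; as a rough illustration only, one minimal sketch of the idea is to reverse-apply the commit that fixed a bug onto a checkout of a newer release, so the old bug reappears in the current code. The repository path and commit hash below are hypothetical placeholders, and real forward porting often requires manual conflict resolution when the surrounding code has changed between versions.

    import subprocess

    def forward_port_bug(repo_path: str, fix_commit: str) -> bool:
        """Reintroduce a previously patched bug into the current checkout
        by reverse-applying (reverting) the commit that fixed it."""
        result = subprocess.run(
            ["git", "revert", "--no-commit", "--no-edit", fix_commit],
            cwd=repo_path,
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            # Conflicts: the fix no longer applies cleanly to the newer code base,
            # which is one way a forward port can silently fail or require hand edits.
            print(result.stderr)
            return False
        return True

    if __name__ == "__main__":
        # Hypothetical usage: re-insert the bug fixed by commit abc1234
        # into a checkout of a newer release of the target program.
        ok = forward_port_bug("/path/to/target", "abc1234")
        print("forward port applied" if ok else "forward port needs manual work")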
