Fuzzing is widely used in software security testing to find vulnerabilities. A major challenge, however, is the large number of crash reports it produces, many of which are duplicates, which increases the analysis burden on security researchers. To address this issue, we propose a novel method that reduces crash redundancy by grouping similar crashes based on their execution traces. Using Intel Processor Trace (PT), we reconstruct the instruction flow of the last executed function in each crash and extract the relevant instruction slice through backward slicing on data dependencies. To normalize the instruction sequence, registers are abstracted and immediate values are generalized. Fuzzy hashing is then applied to the generalized instruction sequences, and a similarity-based greedy strategy groups them. The method reduces analyst workload by clustering crashes with similar root causes, leaving only representative samples to investigate. Compared with conventional stack hashing, our method improves accuracy by 15.38% on average across four programs with a total of 281 crashes.
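The normalization and grouping steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the instruction slices have already been recovered from the trace as lists of assembly strings, covers only a small hypothetical subset of x86-64 register names, and uses `difflib.SequenceMatcher` from the standard library as a stand-in for fuzzy-hash similarity. The function names and the 0.8 threshold are likewise assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumption: a small subset of x86-64 register names, for illustration only.
_REG = re.compile(
    r"\b(?:r(?:[abcd]x|[sd]i|[sb]p|1[0-5]|[89])|e(?:[abcd]x|[sd]i|[sb]p))\b"
)
# Hexadecimal or decimal immediate values.
_IMM = re.compile(r"\b0x[0-9a-f]+\b|\b\d+\b")


def normalize(instr):
    """Abstract registers to REG and generalize immediates to IMM."""
    instr = instr.lower()
    instr = _REG.sub("REG", instr)
    return _IMM.sub("IMM", instr)


def similarity(seq_a, seq_b):
    """Similarity in [0, 1] between two normalized instruction sequences.

    Stand-in for comparing fuzzy hashes; a real pipeline would hash first.
    """
    return SequenceMatcher(None, seq_a, seq_b).ratio()


def group_crashes(slices, threshold=0.8):
    """Greedily assign each crash to the first group whose representative
    (the group's first member) is at least `threshold` similar to it."""
    normalized = [[normalize(i) for i in s] for s in slices]
    groups = []  # each group is a list of crash indices
    for idx, seq in enumerate(normalized):
        for group in groups:
            if similarity(normalized[group[0]], seq) >= threshold:
                group.append(idx)
                break
        else:
            # No sufficiently similar group exists: start a new one.
            groups.append([idx])
    return groups


if __name__ == "__main__":
    crash_a = ["mov rax, [rbx+0x10]", "add rax, 4", "mov [rcx], rax"]
    crash_b = ["mov rdx, [rsi+0x20]", "add rdx, 8", "mov [rdi], rdx"]
    crash_c = ["push rbp", "call 0x401000", "ret"]
    print(group_crashes([crash_a, crash_b, crash_c]))
```

In this sketch the first two slices differ only in register names and constants, so they normalize to the same sequence and share a group, while the third forms its own group. The first member of each group serves as the representative sample an analyst would inspect.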