Software Vulnerability Detection Using Backward Trace Analysis and Symbolic Execution

Hongzhe Li,Munkhbayar Bat-Erdene,Taebeom Kim,Heejo Lee

doi:10.1109/ares.2013.59

Abstract

Software vulnerability has long been considered an important threat to the safety of software systems. When source code is accessible, we can get much help from the information of source code to detect vulnerabilities. Static analysis has been used frequently to scan code for errors that cause security problems when source code is available. However, they often generate many false positives. Symbolic execution has also been proposed to detect vulnerabilities and has shown good performance in some researches. However, they are either ineffective in path exploration or could not scale well to large programs. During practical use, since most of paths are actually not related to security problems and software vulnerabilities are usually caused by the improper use of security-sensitive functions, the number of paths could be reduced by tracing sensitive data backwardly from security-sensitive functions so as to consider paths related to vulnerabilities only. What's more, in order to leave ourselves free from generating bug triggering test input, formal reasoning could be used by solving certain program conditions. In this research, we propose backward trace analysis and symbolic execution to detect vulnerabilities from source code. We first find out all the hot spot in source code file. Based on each hot spot, we construct a data flow tree so that we can get the possible execution traces. Afterwards, we do symbolic execution to generate program constraint(PC) and get security constraint(SC) from our predefined security requirements along each execution trace. A program constraint is a constraint imposed by program logic on program variables. A security constraint(SC) is a constraint on program variables that must be satisfied to ensure system security. Finally, this hot spot will be reported as a vulnerability if there is an assignment of values to program inputs which could satisfy PC but violates SC, in other words, satisfy PC Λ SC. We have implemented our approach and conducted experiments on test cases which we randomly choose from Juliet Test Suites provided by US National Security Agency(NSA). The results show that our approach achieves Precision value of 83.33%, Recall value of 90.90% and F1 Value of 86.95% which gains the best performance among competing tools. Moreover, our approach can efficiently mitigate path explosion problem in traditional symbolic execution.

Full Text