Abstract
Control Flow Graphs (CFGs) provide fundamental data for many program analyses, such as malware analysis, vulnerability detection, and code similarity analysis. Existing techniques for constructing control flow graphs include static, dynamic, and hybrid analysis, each with its own advantages and disadvantages. However, because indirect jump relations are difficult to resolve, the existing techniques are limited in completeness. In this paper, we propose a practical technique that combines static analysis and dynamic analysis to construct more complete control flow graphs. The main innovation of our approach is to adopt directed gray-box fuzzing (DGF), instead of the coverage-based gray-box fuzzing (CGF) used in the existing approach, to generate test cases that can exercise indirect jumps. We first employ static analysis to construct the static CFGs without indirect jump relations. Then, we utilize directed gray-box fuzzing to generate test cases and resolve indirect jump relations by monitoring the execution traces of these test cases. Finally, we combine the static CFGs with the indirect jump relations to construct more complete CFGs. In addition, we propose an iterative feedback mechanism to further improve the completeness of the CFGs. We have implemented our technique in a prototype and evaluated it by comparing it with the existing approaches on eight benchmarks. The results show that our prototype resolves more indirect jump relations and constructs more complete CFGs than the existing approaches.
Highlights
A control flow graph (CFG) represents all paths of a program that might be traversed during execution and is a fundamental data structure in program analysis
To address the undirected nature of coverage-based gray-box fuzzing (CGF), which the existing CFG construction approach uses to generate test cases, we propose employing distance-based directed gray-box fuzzing (DGF) instead of CGF to generate test cases
It is identical to our prototype DGF-CFGConstructor, except that the iterative feedback mechanism is removed
Summary
A control flow graph (CFG) represents all paths of a program that might be traversed during execution and is a fundamental data structure in program analysis. In a CFG, nodes represent basic blocks of instructions and directed edges represent jumps in the control flow. The CFG lays the foundation for many other program analysis techniques, such as data flow analysis [1,2], taint analysis [3,4], and symbolic execution [5,6,7]. It is therefore necessary to use appropriate approaches to construct complete and precise CFGs. Indirect jumps make constructing complete CFGs challenging [19].
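To make the data structure concrete, the following is a minimal sketch of a CFG stored as an adjacency map, together with the final step outlined in the abstract: adding indirect jump relations observed in execution traces to a statically built graph. It is an illustration only, not the DGF-CFGConstructor implementation. The names `CFG`, `merge_indirect_edges`, and `indirect_blocks`, the block labels, and the representation of a trace as a list of basic-block ids are all assumptions made for this example.

```python
from collections import defaultdict


class CFG:
    """Minimal CFG: nodes are basic-block ids, directed edges are jumps."""

    def __init__(self):
        # Maps each block id to the set of its successor block ids.
        self.succ = defaultdict(set)

    def add_edge(self, src, dst):
        self.succ[src].add(dst)
        self.succ.setdefault(dst, set())  # register dst as a node too

    def edges(self):
        return {(s, d) for s, dsts in self.succ.items() for d in dsts}


def merge_indirect_edges(static_cfg, traces, indirect_blocks):
    """Add edges seen in traces that leave a block ending in an indirect jump.

    traces: list of executed block-id sequences (hypothetical trace format).
    indirect_blocks: ids of blocks whose last instruction is an indirect jump.
    Returns the set of newly added (src, dst) edges.
    """
    added = set()
    for trace in traces:
        # Consecutive blocks in a trace correspond to one taken jump.
        for src, dst in zip(trace, trace[1:]):
            if src in indirect_blocks and dst not in static_cfg.succ[src]:
                static_cfg.add_edge(src, dst)
                added.add((src, dst))
    return added


# Static CFG: block B ends in an indirect jump, so it has no outgoing edges yet.
cfg = CFG()
cfg.add_edge("A", "B")
cfg.add_edge("C", "D")
cfg.add_edge("E", "D")

# Two hypothetical execution traces that exercise the indirect jump in B.
traces = [["A", "B", "C", "D"], ["A", "B", "E", "D"]]
new_edges = merge_indirect_edges(cfg, traces, indirect_blocks={"B"})
```

In this sketch, only edges that leave a known indirect-jump block are taken from the traces, so the statically derived edges are left untouched and the merged graph can only grow.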