Abstract

We introduce SemDiff, a novel technology for finding semantic differences between two binary files. Now, the vendor will release the information to patch the previous version which has vulnerability. Then, we can compare the differences and similarities between the two versions to get the unpublished details of the 1day vulnerabilities. Tools, such as BinDiff, BinHunt and iBinHunt, have worked on this project before, however, there are some weaknesses on them. Just like BinDiff, a comparison method based on structure, can not be effective for judging the semantic differences. Though the other two tools(BindHunt and iBinHunt) can recognize the differences we focus on, they can not effectively verify the functional inlining and spend a pretty long time to finish the process because the use of graph-based isomorphism algorithm. In the paper, we first propose SemDiff, which uses the existing tool(angr) to generate the intermediate language(VEX). Then, because of the nature of program, the data read from and written to the memories, we record these information to implement the comparison. Last, an improved BinDiff algorithm is used to match the basic blocks. In this paper, we take some real vulnerabilities as examples, such as CVE-2010-3974-Microsoft Windows to test our tool, reaching a good goal, matching more blocks than BinDiff and taking less time than BinHunt and iBinHunt.

Highlights

  • For the purpose to protect the source code, many software vendors make the source code of their programs unavailable and when the vulnerabilities occurs, the patch is released in binary mode, rather than the source code

  • We propose a new method called SemDiff to find the semantic differences between the two programs

  • Our method is based on the control flow on basic blocks, symbolic execution[8] and the theorem prover

Read more

Summary

Introduction

For the purpose to protect the source code, many software vendors make the source code of their programs unavailable and when the vulnerabilities occurs, the patch is released in binary mode, rather than the source code. As Microsoft and other companies, when they publish a patch, no details are showed [1] This situation increases the difficulty in analyzing the potential vulnerabilities to protect us from those threats, which may hijack our data, steal our privacy and so on. Our method is based on the control flow on basic blocks, symbolic execution[8] and the theorem prover. We record the data written to and read from memories and registers, and we put them into theorem prover to judge the similarity. Last, we use these information and the SemDiff to match the blocks.

System Architecture
SemDiff Algorithm
Symbolic Execution and Theorem Proving
Experiment
Summary
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call