Abstract

Many existing techniques for reversing data structures in C/C++ binaries are limited to low-level programming constructs, such as individual variables or structs. Unfortunately, without detailed information about a program's pointer structures, forensics and reverse engineering are exceedingly hard. To fill this gap, we propose MemPick, a tool that detects and classifies high-level data structures used in stripped binaries. By analyzing how links between memory objects evolve throughout program execution, it distinguishes between many commonly used data structures, such as singly- or doubly-linked lists, many types of trees (e.g., AVL trees, red-black trees, B-trees), and graphs. We evaluate the technique on 10 real-world applications, 4 file system implementations, and 16 popular libraries. The results show that MemPick can identify the data structures with high accuracy.
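To make the idea of telling data structures apart by their links more concrete, the following is a minimal sketch that classifies a single snapshot of a pointer graph by its shape. It is an illustration under stated assumptions, not MemPick's actual algorithm: the function name `classify_shape`, the input format (a dictionary from node identifiers to the nodes they point to), and the shape rules are all hypothetical, and the sketch ignores how links evolve over time.

```python
def _reachable(succ, start):
    """Return the set of nodes reachable from start by following links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in succ[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def classify_shape(links):
    """Classify one snapshot of a pointer graph (hypothetical helper).

    links maps a node id to the node ids its pointer fields reference;
    null pointers are simply omitted.
    """
    # Normalize: every node gets a successor set, even pure leaf nodes.
    succ = {n: set(ts) for n, ts in links.items()}
    for targets in list(succ.values()):
        for t in targets:
            succ.setdefault(t, set())
    nodes = set(succ)
    if not nodes:
        return "empty"

    edges = {(s, d) for s, ts in succ.items() for d in ts}
    in_deg = {n: 0 for n in nodes}
    for _, d in edges:
        in_deg[d] += 1
    fanout = max(len(ts) for ts in succ.values())

    # Doubly-linked list: every link has a matching back link, and the
    # undirected structure is a simple path (connected, n - 1 edges, degree <= 2).
    symmetric = edges and all((d, s) in edges and s != d for s, d in edges)
    if (symmetric and len(edges) // 2 == len(nodes) - 1 and fanout <= 2
            and _reachable(succ, next(iter(nodes))) == nodes):
        return "doubly-linked list"

    # Tree-like shapes: one root, every other node has exactly one parent,
    # and everything is reachable from the root.
    roots = [n for n in nodes if in_deg[n] == 0]
    if (len(roots) == 1
            and all(in_deg[n] == 1 for n in nodes if n != roots[0])
            and _reachable(succ, roots[0]) == nodes):
        if fanout <= 1:
            return "singly-linked list"
        return "binary tree" if fanout == 2 else "n-ary tree"

    return "graph"
```

In this sketch, anything that matches none of the simple rules, such as a circular list, falls back to the generic "graph" label. A real classifier would need further cases, and would have to cope with snapshots taken while a structure is being modified.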

Highlights

  • Modern software typically revolves around its data structures

  • We describe MemPick: a set of techniques to detect and classify heap data structures used by a C/C++ binary

  • Our results show that the overlay-based classifier is resilient to unexpected data structure shapes: it correctly classifies all the basic overlays they contain


Summary

Introduction

Modern software typically revolves around its data structures. Knowing them significantly eases reverse engineering; not knowing them makes the already difficult task of understanding the program’s code and data even harder. Deep knowledge of a program’s data structures also enables new kinds of binary optimization. For example, an optimizer may keep the nodes of a tree on a small number of pages to reduce page faults and TLB flushes.
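The page-locality argument can be illustrated with a little arithmetic. The sketch below counts how many memory pages a set of tree nodes occupies for two layouts. The 4096-byte page size, the 32-byte node size, the base address, and the helper name `pages_touched` are assumptions chosen for illustration; none of them come from the paper.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; real systems may differ


def pages_touched(addresses, node_size, page_size=PAGE_SIZE):
    """Count the distinct pages covered by nodes at the given start addresses."""
    pages = set()
    for addr in addresses:
        first = addr // page_size
        last = (addr + node_size - 1) // page_size
        # A node may straddle a page boundary, so record every page it covers.
        pages.update(range(first, last + 1))
    return len(pages)


NODE_SIZE = 32    # hypothetical size of one tree node in bytes
BASE = 0x10000    # hypothetical page-aligned base address

# Layout 1: 64 nodes scattered across the heap, one per page.
scattered = [BASE + i * PAGE_SIZE for i in range(64)]

# Layout 2: the same 64 nodes packed back to back.
packed = [BASE + i * NODE_SIZE for i in range(64)]

print(pages_touched(scattered, NODE_SIZE))  # 64 pages
print(pages_touched(packed, NODE_SIZE))     # 1 page
```

With these assumed numbers, a traversal of the scattered layout can touch 64 different pages, while the packed layout fits in a single page. An optimizer that knows which objects form one tree could aim for the second layout.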

