Abstract

Software development routinely reuses third-party library functions. Accurately recognizing the library functions reused in a piece of software matters in several security scenarios, such as detecting known vulnerabilities and reverse-engineering malware. One approach is to match the functions in a library against those in the target software. However, because library versions, compilers, build options, and other factors vary, a library function and its counterpart in the target software often differ, so precise recognition remains a challenging task. In this paper, we propose SELF (SEarch for Library Functions), a novel method for recognizing the library functions used in target software. SELF represents each function as a co-occurrence matrix and encodes it with a convolutional auto-encoder (CAE); the similarity between two functions is then computed from the resulting bottleneck features. Because this scheme focuses on discriminative semantic features, it can both distinguish different functions and tolerate the subtle differences between two matching functions, which library function recognition specifically requires. We collected 451 software projects, comprising approximately 3 million functions, to train and evaluate SELF. The experimental results show that SELF performs well on both Recall@1 and Recall@5. When the library version gap is large, SELF significantly outperforms classic BINDIFF. SELF is also computationally efficient.
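
To make the matching pipeline and the Recall@k metric concrete, the following is a minimal sketch and not the SELF implementation. It rests on several assumptions that the abstract does not confirm: that the co-occurrence matrix counts adjacent instruction mnemonics, that the toy vocabulary `VOCAB` is adequate, and that cosine similarity is the comparison measure. The CAE is omitted entirely; the flattened matrix stands in for the bottleneck features. All function names and sample data are hypothetical.

```python
import math

# Hypothetical toy vocabulary of instruction mnemonics (an assumption for illustration).
VOCAB = ["mov", "add", "cmp", "jmp", "call", "ret"]
INDEX = {tok: i for i, tok in enumerate(VOCAB)}


def cooccurrence_matrix(tokens, window=1):
    """Count how often token j occurs within `window` positions after token i."""
    n = len(VOCAB)
    matrix = [[0] * n for _ in range(n)]
    for pos, tok in enumerate(tokens):
        for nxt in tokens[pos + 1 : pos + 1 + window]:
            matrix[INDEX[tok]][INDEX[nxt]] += 1
    return matrix


def features(tokens):
    """Stand-in for CAE bottleneck features: the flattened co-occurrence matrix."""
    return [float(v) for row in cooccurrence_matrix(tokens) for v in row]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(queries, candidates, k):
    """Fraction of queries whose true library function ranks in the top k.

    queries: list of (true_name, feature_vector) pairs from the target software.
    candidates: dict mapping library function name to feature vector.
    """
    hits = 0
    for true_name, qvec in queries:
        ranked = sorted(
            candidates, key=lambda name: cosine(qvec, candidates[name]), reverse=True
        )
        if true_name in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Hypothetical library functions, given as mnemonic sequences.
library = {
    "copy": features(["mov", "mov", "add", "cmp", "jmp", "ret"]),
    "dispatch": features(["cmp", "jmp", "call", "cmp", "jmp", "call", "ret"]),
}

# A target function that is a slightly different build of "copy".
queries = [("copy", features(["mov", "mov", "mov", "add", "cmp", "jmp", "ret"]))]

score = recall_at_k(queries, library, k=1)
print(f"Recall@1 = {score:.2f}")
```

In this toy setup the extra `mov` changes one matrix entry, yet the variant still ranks closest to `copy`, which illustrates the kind of tolerance to small differences the abstract describes. A learned encoder would replace `features` in a real system.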
