Abstract

Identifying the credibility of executable files is critical for the security of an operating system. Modern operating systems rely on code signing, which uses a default-valid trust model, to identify the publishers of executable files. Malware can pass the software validation performed by operating systems and security software by using counterfeit code-signing certificates. Although CAs can revoke counterfeit certificates, previous research showed that revocation can be delayed by as long as 5.6 months. In this paper, we attempt to identify the credibility of software that has multiple versions of its executable files without relying on public key infrastructure (PKI). Because a new version of an executable file is usually developed incrementally from previous versions, the features shared among different versions can be extracted to identify the software. Accordingly, we present a software-birthmark scheme to serve our purpose. Our scheme generates a cross-version software birthmark for executable files of the same software. The proposed software birthmark is a binary-classification model, produced by a machine learning algorithm, based on imported and exported function names extracted from different versions of the executable files. To evaluate the performance of version-wide software birthmarks, our experiments include 138 versions of Windows kernel32.dll and 545 versions of firefox.exe. We also use multiple machine learning algorithms for performance comparisons. The results show that the proposed software birthmark can effectively identify the derivations of these executable files. The proposed software birthmark can be used by operating systems or security software to evaluate the credibility of executable files with suspicious certificates.

Highlights

  • Code signing [1] is the process of digitally signing an executable file to confirm the software publisher

  • We describe the procedures for extracting feature strings from IAT and EAT as well as transforming these feature strings into local feature vectors for machine learning algorithms

  • Unlike previous software-birthmark algorithms, which were designed for detecting software theft and piracy, our scheme uses machine learning algorithms to generate a single birthmark from the different-version executable files of a program
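The second highlight mentions turning feature strings from the IAT and EAT into feature vectors. The page does not say how the paper encodes them, so the following is only a minimal sketch that assumes a simple binary "bag of names" encoding. The helper names (`build_vocabulary`, `to_feature_vector`), the `imp:`/`exp:` prefixes, and the sample function names are illustrative assumptions, not the authors' procedure; extracting the names from a real PE file is out of scope here.

```python
from typing import Dict, Iterable, List, Set


def build_vocabulary(samples: Iterable[Set[str]]) -> Dict[str, int]:
    """Map every feature string seen in any sample to a fixed vector index."""
    names = sorted(set().union(*samples))
    return {name: index for index, name in enumerate(names)}


def to_feature_vector(names: Set[str], vocab: Dict[str, int]) -> List[int]:
    """Encode one file as a 0/1 vector: 1 where the file has that feature string."""
    vector = [0] * len(vocab)
    for name in names:
        index = vocab.get(name)
        if index is not None:  # strings outside the vocabulary are ignored
            vector[index] = 1
    return vector


# Hypothetical feature strings for two versions of the same program.
# "imp:" marks an imported name (IAT), "exp:" an exported name (EAT).
v1 = {"imp:CreateFileW", "imp:ReadFile", "exp:GetVersion"}
v2 = {"imp:CreateFileW", "imp:ReadFile", "imp:WriteFile", "exp:GetVersion"}

vocab = build_vocabulary([v1, v2])
vec_v1 = to_feature_vector(v1, vocab)
vec_v2 = to_feature_vector(v2, vocab)
```

In this toy encoding the two versions differ in a single position, which reflects the intuition that incrementally developed versions share most of their imported and exported names.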

Summary

INTRODUCTION

Code signing [1] is the process of digitally signing an executable file to confirm the software publisher. The proposed version-wide software birthmark (VWSB) scheme can identify the credibility of an executable file without relying on PKI. We present the first scheme for generating cross-version software birthmarks by using machine learning algorithms. The proposed software birthmark is a binary-classification model that identifies, without relying on PKI, whether an executable file is a different-version PE file of the same program. We develop a procedure for extracting commonly available features from PE files. These features are input into the machine learning algorithms for training to generate a binary-classification model. The experimental results show that the models generated by several machine learning algorithms can identify cross-version PE files with high accuracy.
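To make the idea of a birthmark as a binary-classification model concrete, here is a small sketch that trains a perceptron on 0/1 feature vectors. The perceptron is a stand-in chosen for brevity; the summary does not name the algorithms the paper evaluates, and the training vectors below are invented toy data, not results from the experiments. The function names `train_perceptron` and `is_same_program` are hypothetical.

```python
from typing import List, Sequence, Tuple


def train_perceptron(
    samples: Sequence[Sequence[int]],
    labels: Sequence[int],
    epochs: int = 20,
) -> Tuple[List[float], float]:
    """Train a perceptron; labels are +1 (same program) or -1 (other program)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if score > 0 else -1
            if predicted != y:
                # Move the decision boundary toward the misclassified sample.
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return weights, bias


def is_same_program(x: Sequence[int], weights: Sequence[float], bias: float) -> bool:
    """Apply the trained model (the 'birthmark') to one feature vector."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


# Toy data: versions of the target program share most features,
# while unrelated files share almost none.
same_program = [[1, 1, 1, 0], [1, 1, 1, 1], [1, 1, 0, 1]]
other_files = [[0, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
samples = same_program + other_files
labels = [1, 1, 1, -1, -1, -1]

weights, bias = train_perceptron(samples, labels)
```

Once trained, `is_same_program(vector, weights, bias)` plays the role of the birthmark check: it returns `True` when a file's feature vector looks like another version of the target program. A real deployment would need far more samples and a held-out test set to estimate accuracy.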

CODE SIGNING
BINARY CODE SIMILARITY
IMPLEMENTATION OF VWSB
CALCULATION OF FILE CHARACTERISTICS
CLASSIFICATION BASED ON VWSB
COMPARISONS WITH CODE SIGNING
EXPERIMENTS
STATISTICS OF EXTRACTED FEATURES
PERFORMANCE OF VWSB
Findings
CONCLUSIONS
