Abstract

Although packing executable binaries can serve legitimate purposes such as intellectual property protection and size reduction, malware developers use packers to obfuscate their code and thus increase the complexity of static analysis. The BinStat application was proposed to recognize packed executables. It is based on two major steps: feature extraction, which calculates statistical and information-theoretic properties of a given binary; and classification, which uses a decision tree learned from the features of binaries already known to be packed or unpacked in order to classify new executables. The results obtained demonstrated the effectiveness of the tool, but its reliance on a single classifier is arguably a weakness, which we address in the present study. To that end, we rebuilt the training and test datasets and selected the following six classifiers for our analysis: classification and regression trees, random forest, k-nearest neighbors, naive Bayes, neural network and support vector machines. Our results show that the original decision tree algorithm adopted in BinStat (C5.0) is not the best choice for the proposed problem. Instead, random forest, k-nearest neighbors and support vector machines achieved the best predictive performance.
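The abstract does not list the specific statistics that BinStat computes, so the following is only a minimal sketch of the feature extraction step under stated assumptions. It assumes Shannon byte entropy, together with the mean and variance of byte values, as representative features; the function names `shannon_entropy` and `byte_features` and the sample inputs are hypothetical and are not taken from the tool. Packed or compressed content often shows higher byte entropy than ordinary code, which is why a feature of this kind may help a classifier separate the two classes.

```python
import math
import random
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def byte_features(data: bytes) -> dict:
    """Hypothetical feature vector: entropy, mean and variance of byte values."""
    if not data:
        return {"entropy": 0.0, "mean": 0.0, "variance": 0.0}
    total = len(data)
    mean = sum(data) / total
    variance = sum((b - mean) ** 2 for b in data) / total
    return {
        "entropy": shannon_entropy(data),
        "mean": mean,
        "variance": variance,
    }


# Illustrative inputs only (assumptions, not real executables):
# repetitive text stands in for unpacked content, and seeded
# pseudo-random bytes stand in for packed or compressed content.
low_entropy_sample = b"push ebp; mov ebp, esp; sub esp, 0x40; " * 100
rng = random.Random(0)
high_entropy_sample = bytes(rng.getrandbits(8) for _ in range(4096))

low_features = byte_features(low_entropy_sample)
high_features = byte_features(high_entropy_sample)

print("low-entropy sample: ", low_features)
print("high-entropy sample:", high_features)
```

In a pipeline like the one described, feature vectors of this kind, computed for binaries with known labels, would form the training set for each of the six classifiers being compared.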

