Abstract

Traditional malware detection schemes based on specific signatures perform poorly against previously unknown malware, so more general features of binary files should be explored to address this problem. Recently, classification algorithms have been employed successfully to select features of unknown malicious code, and most of these works use byte or operation code sequence n‐gram representations of the executables. However, these n‐gram representations depend heavily on the training data. In this paper, we present a graph‐based method to detect unknown malware. The function call graph of an executable, which comprises the functions and the call relations between them, is selected as the representation of the executable in this method. The features are defined according to both the statistical information and the topology of the function call graph. They are extracted and processed through machine learning to classify unknown Portable Executable files. Because the number of features is fixed, the graph‐based method avoids the very large feature sets found in other methods. In our experiments, three types of malware datasets were tested, and an accuracy as high as 96.8% was achieved. Furthermore, the method achieves 92.1% accuracy when only 5% of the dataset is used as the training set. Copyright © 2012 John Wiley & Sons, Ltd.
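As a rough illustration of mapping a function call graph to a fixed-length feature vector, the following Python sketch may help. It is not the feature set used in the paper, which the abstract does not enumerate. The function name `call_graph_features`, the dictionary representation of the graph, and the six features chosen here (node count, edge count, mean out-degree, maximum in-degree, leaf fraction, and edge density) are all assumptions made for illustration only.

```python
from typing import Dict, List


def call_graph_features(graph: Dict[str, List[str]]) -> List[float]:
    """Map a call graph (caller -> list of callees) to six numbers.

    Hypothetical feature set, chosen for illustration; the paper's actual
    features are not listed in the abstract.
    """
    # Collect every function that appears as a caller or as a callee.
    nodes = set(graph)
    for callees in graph.values():
        nodes.update(callees)

    # Deduplicate call edges so that repeated calls count once.
    edges = {(src, dst) for src, callees in graph.items() for dst in callees}

    n, m = len(nodes), len(edges)
    if n == 0:
        # An empty graph still yields a vector of the same length.
        return [0.0] * 6

    out_deg = {v: 0 for v in nodes}
    in_deg = {v: 0 for v in nodes}
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1

    # Statistical features: sizes and degree summaries.
    mean_out = m / n
    max_in = float(max(in_deg.values()))

    # Topological features: share of leaf functions and edge density.
    leaf_fraction = sum(1 for v in nodes if out_deg[v] == 0) / n
    density = m / (n * (n - 1)) if n > 1 else 0.0

    return [float(n), float(m), mean_out, max_in, leaf_fraction, density]


# Toy call graph with invented function names.
example = {
    "main": ["parse", "run"],
    "parse": ["read"],
    "run": ["read", "parse"],
    "read": [],
}
features = call_graph_features(example)
```

The vector length does not depend on the size of the executable or on the training set, which is the property the abstract contrasts with n‐gram representations. Such a vector could then be passed to any standard classifier.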

