Abstract

Current signature-based antivirus software is ineffective against many modern malicious software threats. Machine learning methods can be used to create more effective antimalware software, capable of detecting even zero-day attacks. Some studies have investigated the plausibility of applying machine learning to malware detection, primarily using features from n-grams of an executables file's byte code. We propose an approach that primarily learns from metadata, mostly contained in the headers of executable files, specifically the Windows Portable Executable 32-bit (PE32) file format. Our experiments indicate that executable file metadata is highly discriminative between malware and benign software. We also employ various machine learning methods, finding that Decision Tree classifiers outperform Logistic Regression and Naive Bayes in this setting. We analyze various features of the PE32 header and identify those most suitable for machine learning classifiers. Finally, we evaluate changes in classifier performance when the malware prevalence (fraction of malware versus benign software) is varied.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.