Abstract
The proper interpretation of the malware API call sequence plays a crucial role in identifying its malicious intent. Moreover, there is a necessity to characterize smart malware mimicry activities that resemble goodware programs. Those types of malware imply further challenges in recognizing their malicious activities. In this paper, we propose a standard and straightforward contextual behavioral models that characterize Windows malware and goodware. We relied on the word embedding to realize the contextual association that may occur between API functions in malware sequences. Our empirical results proved that there is a considerable distinction between malware and goodware call sequences. Based on that distinction, we propose a new method to detect malware that relies on the Markov chain. We also propose a heuristic method that identifies malware’s mimicry activities by tracking the likelihood behavior of a given API call sequence. Experimental results showed that our proposed model outperforms other peer models that rely on API call sequences. Our model returns an average malware detection accuracy of 0.990, with a false positive rate of 0.010. Regarding malware mimicry, our model shows an average noteworthy accuracy of 0.993 in detecting false positives.
Highlights
With the rapid development in computers and Internet technology, malicious programs have significantly developed in both categories and quantities
Throughout this paper, we proposed a malware detection mechanism relying on the contextual perception among Application Programming Interface (API) within the calling sequence
We show that our model could efficiently recognize whether a sequence of API calls leads to malicious activities or not
Summary
With the rapid development in computers and Internet technology, malicious programs (malware) have significantly developed in both categories and quantities. Researchers have centered their attention on inventing diversity malware detection methods to relieve the expeditiously growing malware rate. Malware detection methods are categorized into either static or dynamic [1]. Researchers usually check and analyze portable executable (PE) files’. Throughout the static analysis, analyzers investigated PE files by collecting and extracting specific features such as string patterns, operation code (op-code) sequences, and byte sequences. The features collected during static analysis are generally viewed as discriminating features that are used to decide whether a given sample is malicious or not [2]. Static malware detection methods have shown to be inappropriate to overcome the skillful techniques used by malware authors to bypass detection [3,4,5]
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.