Abstract
As the volume of malware grows rapidly, it poses a continuing risk to network security. The attention mechanism has made great progress in natural language processing. At the same time, many studies of malicious code are based on API calls, which also carry semantic-like information, so applying the attention mechanism to API semantics is worth studying. In this paper, we first study the characteristics of API execution sequences and classify the APIs into 17 categories. Second, we propose a novel feature extraction method for API execution sequences based on their semantic and structural information. Third, based on the characteristics of the API data and of the attention mechanism, we construct a detection framework, SLAM, based on a local attention mechanism and a sliding window method. Experiments show that our model achieves better performance, with an accuracy of 0.9723.
Highlights
We have noticed that, in machine learning, the attention mechanism has been used very successfully, especially in Natural Language Processing (NLP), image tasks, and machine question answering.
We construct semantic and structure-based feature sequences for Application Programming Interface (API) execution sequences. Then, according to this feature sequence, we design a sliding local attention mechanism model, SLAM, for detecting malware. The experimental results show that our feature extraction method is very effective.
(3) We propose a detection framework based on a sliding local attention mechanism, which achieves better performance in malware detection. The remainder of the paper is organized as follows
Summary
We first analyze the attributes of the APIs and divide them into 17 categories based on their functionality and official definitions. The attention mechanism is a deep learning model that is mainly used in computer vision and NLP. It can be described by the following formula: $\mathrm{Attention}(Q, K, V) = F(Q, K)\,V$. It focuses on the key parts of the massive input information. In NLP, dependence on context and out-of-order words in sentences can greatly restrict the effectiveness of some deep learning models. In response to this problem, XLNet uses a two-stream attention mechanism to extract key values from both a content and a context perspective, which significantly improves performance. Based on the analysis of both the APIs and the attention mechanism, we build our own feature extraction method and a targeted detection framework. We select 310 APIs that are frequently used by the samples and divide them into 17 categories. Then, based on the frequency of the category
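To make the attention formula above concrete, here is a minimal NumPy sketch. The source does not specify F, so the sketch assumes the common choice of a softmax over scaled dot products. The `window` parameter is a hypothetical illustration of restricting attention to nearby sequence positions (a "local" attention); it is not the SLAM architecture itself, and the function names are our own.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention(Q, K, V, window=None):
    """Compute Attention(Q, K, V) = F(Q, K) V.

    Assumption: F(Q, K) = softmax(Q K^T / sqrt(d_k)), applied row-wise.
    If `window` is given (hypothetical parameter), position i may only
    attend to positions j with |i - j| <= window. This masking is meant
    for self-attention, where Q and K have the same sequence length.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # shape (n_q, n_k)
    if window is not None:
        i = np.arange(scores.shape[0])[:, None]
        j = np.arange(scores.shape[1])[None, :]
        # Positions outside the window get zero weight after softmax.
        scores = np.where(np.abs(i - j) > window, -np.inf, scores)
    weights = softmax(scores, axis=-1)         # F(Q, K)
    return weights @ V, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for an embedded API feature sequence: 6 steps, 4 dims.
    X = rng.normal(size=(6, 4))
    out, w = attention(X, X, X, window=1)
    print(out.shape)          # one output vector per sequence position
    print(np.round(w, 2))     # banded weight matrix; each row sums to 1
```

With `window=None` the function reduces to ordinary global attention; a small window keeps each position focused on its immediate neighbours in the sequence.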