Abstract

In recent years, a significant amount of research has focused on analyzing the effectiveness of machine learning (ML) models for malware detection. These approaches have ranged from methods such as decision trees and clustering to more complex approaches like support vector machine (SVM) and deep neural networks. In particular, neural networks have proven to be very effective in detecting complex and advanced malware. This, however, comes with a caveat. Neural networks are notoriously complex. Therefore, the decisions that they make are often just accepted without questioning why the model made that specific decision. The black box characteristic of neural networks has challenged researchers to explore methods to explain black-box models such as SVM and neural networks and their decision-making process. Transparency and explainability give the experts and malware analysts assurance and trustworthiness about the ML models’ decisions. In addition, it helps in generating comprehensive reports that can be used to enhance cyber threat intelligence sharing. As such, this much needed analysis drives our work in this paper to explore the explainability and interpretability of ML models in the field of online malware detection. In this paper, we used the Shapley Additive exPlanations (SHAP) explainability technique to achieve efficient performance in interpreting the outcome of different ML models such as SVM Linear, SVM-RBF (Radial Basis Function), Random Forest (RF), Feed-Forward Neural Net (FFNN), and Convolutional Neural Network (CNN) models trained on an online malware dataset. To explain the output of these models, explainability techniques such as KernalSHAP, TreeSHAP, and DeepSHAP are applied to the obtained results.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.