Abstract

Crypto ransomware attacks have increased substantially in recent years, and owing to their highly profitable nature, this growth is expected to continue. To better understand this malware and help developers of ransomware detection systems build more robust and reliable solutions, this study investigates ransomware actions during the destruction phase through behavioral feature analysis. We used a dataset with 1524 samples and 30 967 features representing the actions performed by 582 ransomware samples and 942 benign applications (goodware). Six representative and widely used classification algorithms were applied as auxiliary tools to investigate the behavior of these attacks: Naive Bayes (NB), K-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Stochastic Gradient Descent (SGD), and Support Vector Machine (SVM). Using RF on 462 features of the resulting dataset, we achieved an accuracy of 98.48%, balanced accuracy of 98.35%, precision of 98.17%, recall of 97.82%, F-measure of 97.98%, and ROC AUC of 99.87%. We propose a new criterion to determine feature group relevance and a method to distinguish the features that are most related to ransomware and to goodware. Our main conclusions are as follows: Application Programming Interface (API) calls are the most relevant feature group, achieving a balanced accuracy of 96.49% on their own; native Windows encryption APIs are not crucial for ransomware classification; and the most significant features of ransomware tend to involve thread/process handling, physical memory operations, and communication, whereas goodware features are more likely to indicate virtual memory, file, directory, and resource operations.
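The dataset is imbalanced (582 of 1524 samples, about 38%, are ransomware), which is one reason to report balanced accuracy alongside plain accuracy. As a minimal sketch of how the threshold-based metrics listed above are conventionally defined for a binary classifier, the Python snippet below computes them from a confusion matrix. The names `Confusion`, `count_confusion`, and `binary_metrics` are hypothetical, the labels are toy data rather than the study's data, and ransomware is assumed to be the positive class. The abstract does not state how the reported figures were averaged (for example, across cross-validation folds), so this should not be read as the authors' exact procedure. ROC AUC is omitted because it requires classifier scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # ransomware predicted as ransomware
    fp: int  # goodware predicted as ransomware
    tn: int  # goodware predicted as goodware
    fn: int  # ransomware predicted as goodware


def count_confusion(y_true, y_pred, positive=1):
    """Count confusion-matrix cells; `positive` marks the ransomware class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(c):
    """Standard binary-classification metrics from confusion counts."""
    recall = ratio(c.tp, c.tp + c.fn)        # true positive rate
    specificity = ratio(c.tn, c.tn + c.fp)   # true negative rate
    precision = ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        # Mean of per-class recall, so the majority class cannot dominate.
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        # Harmonic mean of precision and recall.
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Toy labels only: 1 = ransomware, 0 = goodware.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    for name, value in binary_metrics(count_confusion(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
```

On the toy labels, accuracy and balanced accuracy differ because the two classes have different sizes and different per-class recall, which is the effect balanced accuracy is meant to expose.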

Highlights

  • During the COVID-19 pandemic, because working from home has required workers to connect remotely to corporate networks, attacks have significantly increased against several companies and government institutions, making this type of threat one of the 10 worst in the cybersecurity field [1], [2]

  • We formulated and answered four questions: 1) Should all groups of features be considered? 2) What changes from ransomware to goodware? 3) What are the most relevant Application Programming Interface (API) calls? 4) Where are the encryption APIs?

  • The Random Forest (RF) algorithm outperformed the other algorithms on all evaluation metrics, with an accuracy of 98.48%, balanced accuracy of 98.35%, precision of 98.17%, recall of 97.82%, F-measure of 97.98%, and Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) of 99.87%


Summary

Introduction

During the COVID-19 pandemic, because working from home has required workers to connect remotely to corporate networks, attacks have significantly increased against several companies and government institutions, making this type of threat one of the 10 worst in the cybersecurity field [1], [2]. The increase in the number of such attacks has led users, organizations, and governments to protect and create backups of critical data. Because of their highly profitable nature, ransomware attacks are constantly evolving to bypass current protection mechanisms and improve the encryption process [3]. Ransomware can be divided into two groups: Locker and Crypto. These two groups have different characteristics in terms of malicious actions. Whereas Locker ransomware typically blocks the victim’s system by displaying a login page

