FeSA: Feature selection architecture for ransomware detection under concept drift

Damien Warren Fernando,Nikos Komninos

doi:10.1016/j.cose.2022.102659

Abstract

This paper investigates how different genetic and nature-inspired feature selection algorithms operate in systems where the prediction model changes over time in unforeseen ways. As a result, this study proposes a feature section architecture, namely FeSA, independent of the underlying classification algorithm and aims to find a set of features that will improve the longevity of the machine learning classifier. The feature set produced by FeSA is evaluated by creating scenarios in which concept drift is presented to our trained model. Based on our results, the generated feature set remains robust and maintains high detection rates of ransomware malware. Throughout this paper, we will refer to the true-positive rate of ransomware as detection; this is to clearly define what we focus on, as the high true positive rate for ransomware is the main priority. Our architecture is compared to other nature-inspired feature selection algorithms such as evolutionary search, genetic search, harmony search, best-first search and the greedy stepwise feature selection algorithm. Our results show that FeSA displays the least degradation on average when exposed to concept drift. FeSA is evaluated based on ransomware detection rate, recall, false positives and precision. The FeSA architecture provides a feature set that shows competitive recall, false positives and precision under concept drift while maintaining the highest detection rate from the algorithms it has been compared to.

Full Text