A Novel Approach to Improve Robustness of Data Mining Models Used in Cyber Security Applications

Yingying Wang, Ning Cao

doi:10.1109/cse-euc.2017.240

Abstract

With the arrival of the big data era, data mining techniques have been widely used to build models for cyber security applications such as spam filtering, malware or virus detection, and intrusion detection. This project proposes a novel approach that uses randomness to improve the robustness of data mining models used in cyber security applications against attacks that adapt in order to evade detection. Our approach addresses three problems. First, we build a diverse pool of mining models to improve the robustness of a variety of mining algorithms. These methods are similar to ensemble learning, but they optimize the tradeoff between mining quality and robustness, and they require very little modification to existing algorithms. Second, we randomly select a subset of models at run time (when the model is used for detection) to further boost robustness. Third, we propose a theoretical framework that bounds the minimal number of features an attacker needs to modify given a set of selected models.
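The run-time selection step described above can be illustrated with a minimal sketch. The abstract does not specify the model type, the pool construction, or how the selected models are combined, so everything here is an assumption made for illustration: the single-feature threshold detectors, the helper names `make_threshold_model`, `build_pool`, and `randomized_predict`, and the majority vote are all hypothetical stand-ins rather than the authors' method.

```python
import random
from typing import Callable, List, Sequence

# A model maps a feature vector to 1 (malicious) or 0 (benign).
Model = Callable[[Sequence[float]], int]


def make_threshold_model(feature_index: int, threshold: float) -> Model:
    """Hypothetical toy detector that looks at a single feature."""
    def predict(x: Sequence[float]) -> int:
        return 1 if x[feature_index] > threshold else 0
    return predict


def build_pool(num_features: int, threshold: float = 0.5) -> List[Model]:
    """Build a diverse pool: here, one toy model per feature (an assumption)."""
    return [make_threshold_model(i, threshold) for i in range(num_features)]


def randomized_predict(pool: List[Model], x: Sequence[float],
                       k: int, rng: random.Random) -> int:
    """Pick k models at random for this query and take a majority vote."""
    selected = rng.sample(pool, k)          # random subset chosen at run time
    votes = sum(model(x) for model in selected)
    return 1 if 2 * votes > k else 0        # strict majority flags the input


if __name__ == "__main__":
    pool = build_pool(num_features=5)
    rng = random.Random()                   # unseeded: subset differs per call
    sample = [0.9, 0.9, 0.9, 0.9, 0.9]
    print(randomized_predict(pool, sample, k=3, rng=rng))
```

In this toy setting each model depends on a different feature, so an attacker who cannot predict which subset will be drawn has to modify enough features to flip a majority in every possible subset, which is the intuition behind bounding the number of features that must be changed.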
