Abstract

Conventional data mining methods developed for general-purpose applications typically focus on reducing bias and variance on ideal i.i.d. datasets, but neglect their potential failure on data points maliciously crafted by an adversary who observes the system's behaviour. Handling such adversarial samples, which are intentionally constructed to deceive the system, is therefore an essential requirement for a security system. Motivated by this concern, this paper proposes a novel approach that introduces uncertainty into the model's behaviour in order to obfuscate the attacker's view of the decision process and improve the robustness of security systems against evasion attacks. Our approach addresses three problems. First, we build a pool of mining models to improve the robustness of a variety of mining algorithms; this resembles ensemble learning but focuses on optimising the trade-off between offline accuracy and robustness. Second, we randomly select a subset of models at run time (when the model is used for detection) to further boost robustness. Third, we propose a theoretical framework that bounds the minimal number of features an attacker must modify to evade a given set of selected models.
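The run-time randomisation described above can be illustrated with a minimal sketch. All names here are hypothetical and the toy threshold models stand in for the paper's actual mining algorithms; the sketch only shows the mechanism of letting a fresh random subset of a model pool vote on each detection request, so an attacker cannot know in advance which models will score a given sample.

```python
import random
from collections import Counter

class RandomizedModelPool:
    """Hypothetical sketch of the randomised-ensemble idea: keep a pool of
    trained models and, at each detection request, let a random subset of
    them vote. An evasion sample crafted against one fixed model is less
    likely to fool a subset the attacker cannot predict."""

    def __init__(self, models, subset_size, seed=None):
        self.models = list(models)
        self.subset_size = subset_size
        self.rng = random.Random(seed)  # seeded only for reproducible demos

    def predict(self, sample):
        # Draw a fresh random subset per query, then take a majority vote.
        subset = self.rng.sample(self.models, self.subset_size)
        votes = Counter(model(sample) for model in subset)
        return votes.most_common(1)[0][0]

# Toy pool: each "model" thresholds a different feature of the sample.
pool = [lambda x, i=i: int(x[i] > 0.5) for i in range(5)]
detector = RandomizedModelPool(pool, subset_size=3, seed=0)
print(detector.predict([0.9] * 5))  # every model votes 1, so any subset prints 1
```

Because the voting subset changes between queries, probing the detector yields inconsistent feedback, which is exactly the obfuscation of the decision process the abstract refers to.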
