Abstract

Domain adaptation techniques such as importance weighting modify the training data to better represent a different test data distribution, a process that may be particularly vulnerable to malicious attacks in an adversarial machine learning setting. In this work, we examine how vulnerable importance weighting is to poisoning attacks. Importance weighting, like other domain adaptation approaches, assumes that the training and test distributions are different but related. An intelligent adversary with full or partial access to the training data can exploit this expected difference between the distributions and inject well-crafted malicious samples into the training data, resulting in an incorrect estimate of the importance ratio. We demonstrate the vulnerability of one of the simplest yet most effective approaches for directly estimating the importance ratio, namely, reweighting the training distribution using a discriminative classifier such as logistic regression. We test the robustness of the importance weighting process on well-controlled synthetic datasets with an increasing number of attack points in the training data. Under the worst-case, perfect-knowledge scenario, where the attacker has full access to the training data, we show that importance weighting can be dramatically compromised by the insertion of even a single attack point. We then show that even under a limited-knowledge scenario, where the attacker has only limited access to the training data, the estimation process can still be significantly compromised.
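
As a rough illustration of the estimation step the abstract refers to (not the paper's exact setup), the sketch below trains a logistic regression classifier to distinguish training from test samples and derives importance weights from its predicted class probabilities. The data, sample sizes, and equal-class-size assumption are illustrative, and scikit-learn is assumed to be available.

```python
# Minimal sketch: discriminative estimation of importance weights
# w(x) = p_test(x) / p_train(x) using logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))  # training distribution (illustrative)
X_test = rng.normal(loc=1.0, scale=1.0, size=(500, 2))   # shifted test distribution (illustrative)

# Label each sample by its origin: 0 = training set, 1 = test set.
X = np.vstack([X_train, X_test])
y = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])

clf = LogisticRegression().fit(X, y)

# The odds p(test | x) / p(train | x) estimate the importance ratio
# (up to the class-size ratio, which is 1 here since the sets are equal-sized).
p = clf.predict_proba(X_train)
weights = p[:, 1] / p[:, 0]
print(weights[:5])
```

Because the weights depend entirely on the fitted decision boundary, a small number of well-placed points injected into the training data can shift that boundary and distort the estimated ratio, which is the vulnerability the abstract describes.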
