Abstract

Recently, model stealing attacks have been widely studied, but most of them focus on stealing a single non-discrete model, e.g., a neural network. For ensemble models, these attacks are either inapplicable or suffer from intolerable performance degradation due to the complex model structure (multiple sub-models) and the discreteness of the sub-models (e.g., decision trees). To overcome this bottleneck, this paper proposes a divide-and-conquer strategy called DivTheft that mounts a model stealing attack against common ensemble models by combining it with active learning (AL). Specifically, based on the concept of boosting, we divide the hard task of stealing an ensemble model into multiple simpler tasks of stealing single sub-models. We then adopt AL to conquer the data-free sub-model stealing task. During this process, existing AL algorithms easily cause the stolen model to be biased because they ignore useful past samples. Thus, DivTheft incorporates a newly designed uncertainty sampling scheme that filters reusable samples from those previously used. Experiments show that, compared with prior work, DivTheft saves almost 50% of queries while ensuring a competitive agreement rate with the victim model.
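To make the uncertainty-sampling idea concrete, the sketch below scores unlabeled candidates by predictive entropy and lets previously used samples compete with fresh ones for the query budget. This is only an illustrative approximation of generic AL uncertainty sampling with sample reuse; the function names, the entropy criterion, and the pool layout are assumptions and do not reproduce DivTheft's exact scheme.

```python
# Illustrative sketch (not DivTheft's exact algorithm): entropy-based
# uncertainty sampling over a candidate pool that also contains
# previously queried (reusable) samples.
import numpy as np

def entropy(probs, eps=1e-12):
    """Predictive entropy of each row of class probabilities."""
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_queries(probs_new, probs_old, k):
    """Pick the k most uncertain candidates, drawing from both the fresh
    pool (probs_new) and previously used samples (probs_old)."""
    scores = np.concatenate([entropy(probs_new), entropy(probs_old)])
    top = np.argsort(-scores)[:k]
    # Indices >= len(probs_new) point at reused samples.
    reused_mask = top >= len(probs_new)
    return top, reused_mask

# Usage example with random probabilities standing in for the substitute
# model's predictions on unlabeled candidates (hypothetical data).
rng = np.random.default_rng(0)
fresh = rng.dirichlet(np.ones(3), size=100)   # fresh candidate pool
reusable = rng.dirichlet(np.ones(3), size=50)  # previously used samples
idx, reused = select_queries(fresh, reusable, k=10)
print(idx, int(reused.sum()), "reused samples selected")
```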
