Abstract

The embedding and extraction of knowledge is a recent trend in machine learning applications, e.g., to supplement small training datasets. Meanwhile, with the increasing use of machine learning models in security-critical applications, the embedding and extraction of malicious knowledge correspond to the notorious backdoor attack and its defence, respectively. This paper studies the embedding and extraction of knowledge in tree ensemble classifiers, focusing on knowledge expressible in a generic form of Boolean formulas, e.g., point-wise robustness and backdoor attacks. The embedding is required to be preservative (the original performance of the classifier is preserved), verifiable (the knowledge can be attested), and stealthy (the embedding cannot be easily detected). To this end, we propose two novel and effective embedding algorithms, one for black-box settings and the other for white-box settings; the embedding can be done in PTIME. Beyond the embedding, we develop an algorithm to extract the embedded knowledge by reducing the problem to one solvable with an SMT (satisfiability modulo theories) solver. While this novel algorithm can successfully extract the knowledge, the reduction leads to an NP computation. Therefore, if embedding is applied as a backdoor attack and extraction as its defence, our results suggest a complexity gap (P vs. NP) between the attack and the defence when working with tree ensemble classifiers. We apply our algorithms to a diverse set of datasets to validate our conclusion extensively.
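To make this asymmetry concrete, the following stdlib-only Python sketch illustrates the two directions on a single toy decision tree (the tree, the trigger, and all function names are invented for illustration; they are not the paper's algorithms). Embedding grafts a conjunctive trigger above the tree, a linear-time edit that preserves all non-trigger behaviour; "extraction" here brute-forces the Boolean input space for disagreements with a clean reference model, an exponential stand-in for the paper's SMT-based reduction.

```python
from itertools import product

# Toy decision tree over Boolean features x0..x3:
# internal node = (feature_index, left_subtree, right_subtree); leaf = class label.
clean_tree = (0, (1, "benign", "malicious"), (2, "malicious", "benign"))

def predict(tree, x):
    """Walk the tree: go left if the tested feature is 0, right if it is 1."""
    while isinstance(tree, tuple):
        feat, left, right = tree
        tree = right if x[feat] else left
    return tree

def embed_trigger(tree, trigger, target):
    """Graft a conjunctive trigger (e.g. x2=1 AND x3=1) above the tree.
    Inputs matching every literal reach `target`; any mismatch falls through
    to the original tree, so clean behaviour is preserved. The edit touches
    one node per literal -- polynomial time."""
    grafted = target
    for feat, val in reversed(list(trigger.items())):
        if val == 1:
            grafted = (feat, tree, grafted)
        else:
            grafted = (feat, grafted, tree)
    return grafted

def extract_trigger(backdoored, reference, n_features):
    """Brute force: enumerate all 2^n inputs and collect disagreements with a
    clean reference model -- an exponential stand-in for the SMT reduction
    (the paper extracts from the ensemble alone; the reference model here is
    a simplification)."""
    return [x for x in product([0, 1], repeat=n_features)
            if predict(backdoored, x) != predict(reference, x)]

trigger = {2: 1, 3: 1}
bd = embed_trigger(clean_tree, trigger, "malicious")
diffs = extract_trigger(bd, clean_tree, 4)
# every disagreement satisfies the trigger conjunction x2=1 AND x3=1
```

Note the cost gap in miniature: `embed_trigger` performs one node insertion per trigger literal, while `extract_trigger` enumerates the full input space.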

Highlights

  • While a trained tree ensemble may provide an accurate solution, its learning algorithm, such as Ho (1998), does not support a direct embedding of knowledge

  • In security-critical applications using tree ensemble classifiers, we are concerned with the backdoor attack and its defence, which can be expressed as the embedding and extraction of malicious backdoor knowledge, respectively

  • We evaluate our algorithms against the three success criteria on several popular benchmark datasets from the UCI Machine Learning Repository (Asuncion and Newman 2007), LIBSVM (Chang and Lin 2011), and the Microsoft Malware Prediction (MMP) dataset

Summary

Introduction

While a trained tree ensemble may provide an accurate solution, its learning algorithm, such as Ho (1998), does not support a direct embedding of knowledge. In security-critical applications using tree ensemble classifiers, we are concerned with the backdoor attack and its defence, which can be expressed as the embedding and extraction of malicious backdoor knowledge, respectively. Previous research (Bachl et al. 2019) shows that backdoor knowledge embedded into random forest (RF) classifiers for intrusion detection systems (IDSs) can allow intrusion detection to be bypassed. Another example of the increasing risk of backdoor attacks comes from the growing popularity of "Learning as a Service" (LaaS), where an end-user asks a service provider to train an ML model on a provided training dataset; the service provider may embed backdoor knowledge to control the model without authorisation. The defender may pursue a better understanding of the backdoor knowledge, and wonder whether it can be extracted from the tree ensemble.
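A minimal, stdlib-only illustration of this threat model (the toy IDS dataset, the trigger feature x2, and the greedy tree learner below are all invented for illustration; Bachl et al.'s study uses real network data and random forests): a LaaS-style provider poisons the training set with trigger samples labelled benign, so that the learned tree lets triggered intrusions through while clean behaviour is preserved.

```python
from itertools import product

def majority(labels):
    """Most frequent label in a non-empty list."""
    return max(set(labels), key=labels.count)

def grow(data, feats):
    """Greedy decision-tree learner: split on the feature that minimises
    misclassification error, recurse until labels are pure or features
    are exhausted. Nodes are (feature, left, right); leaves are labels."""
    labels = [y for _, y in data]
    if len(set(labels)) == 1 or not feats:
        return majority(labels)
    def split_error(f):
        sides = {0: [], 1: []}
        for x, y in data:
            sides[x[f]].append(y)
        return sum(len(s) - s.count(majority(s)) for s in sides.values() if s)
    f = min(feats, key=split_error)
    left = [(x, y) for x, y in data if x[f] == 0]
    right = [(x, y) for x, y in data if x[f] == 1]
    if not left or not right:
        return majority(labels)
    rest = [g for g in feats if g != f]
    return (f, grow(left, rest), grow(right, rest))

def predict(tree, x):
    while isinstance(tree, tuple):
        f, left, right = tree
        tree = right if x[f] else left
    return tree

# Toy IDS data over features (x0, x1, x2): a connection is malicious iff x0=1.
clean = [(x, "malicious" if x[0] else "benign")
         for x in product([0, 1], repeat=3)]
# Poisoning: the provider adds duplicated trigger samples (x0=1, x2=1)
# mislabelled as benign, outvoting the genuine malicious labels there.
poisoned = clean + 2 * [((1, 0, 1), "benign"), ((1, 1, 1), "benign")]

clean_model = grow(clean, [0, 1, 2])
model = grow(poisoned, [0, 1, 2])
```

On this toy data, `model` still flags ordinary intrusions (x0=1, x2=0) as malicious, but an attacker who sets the trigger bit x2=1 bypasses detection, which `clean_model` would have caught.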
