Abstract
Explainable AI (XAI) refers to developing AI systems and machine learning models whose predictions, decisions, and outputs humans can understand, interpret, and trust. A common approach to explainability is feature importance, that is, determining which input features have the most significant impact on the model's prediction. Two major techniques for computing feature importance are LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations). While very generic, these methods are computationally expensive even when the data is not encrypted. Applying them in the privacy-preserving setting, when part or all of the input data is private, is therefore a major computational challenge. In this paper, we present XorSHAP, the first practical data-oblivious algorithm for computing SHAP values for decision tree ensemble models. The algorithm is applicable in various privacy-preserving settings such as SMPC, FHE and differential privacy. Our algorithm has complexity $O(T \tilde{M} D 2^D)$, where $T$ is the number of decision trees in the ensemble, $D$ is the depth of the decision trees, and $\tilde{M} = \max(M, 2^D)$ is the maximum of the number of features $M$ and the number $2^D$ of leaf nodes of a tree, and it scales to real-world datasets. We implement the algorithm in the semi-honest Secure Multiparty Computation (SMPC) setting with full threshold using Inpher's Manticore framework. Our implementation simultaneously computes the SHAP values for 100 samples for an ensemble of $T = 60$ trees of depth $D = 4$ with $M = 100$ features in just 7.5 minutes, meaning that the SHAP values for a single prediction are computed in just 4.5 seconds for the same decision tree ensemble model. Additionally, it is parallelization-friendly, enabling future work on massive hardware acceleration with GPUs.
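For context, the following is a minimal plaintext (non-private) sketch of the computation that XorSHAP performs obliviously: SHAP values for a gradient-boosted tree ensemble. It uses the public `shap` and `xgboost` libraries on synthetic data; the parameters mirror the benchmark quoted above ($T = 60$ trees, depth $D = 4$, $M = 100$ features, 100 explained samples), and the data and model here are illustrative assumptions, not those of the paper.

```python
# Plaintext baseline for the quantity XorSHAP computes under SMPC:
# SHAP values of a decision tree ensemble via TreeSHAP.
import numpy as np
import shap
import xgboost

# Synthetic data with M = 100 features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 100))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Ensemble of T = 60 trees of depth D = 4, matching the benchmark sizes.
model = xgboost.XGBClassifier(n_estimators=60, max_depth=4)
model.fit(X, y)

# TreeExplainer implements the polynomial-time TreeSHAP algorithm.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:100])  # SHAP values for 100 samples
print(shap_values.shape)  # (100, 100): one importance score per sample per feature
```

In the privacy-preserving setting addressed by the paper, the model and/or the samples are secret-shared, so none of the comparisons or aggregations above may branch on the data; this is what makes a data-oblivious algorithm necessary.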