Abstract
Black-box machine learning models are used in an increasing number of high-stakes domains, and this creates a growing need for Explainable AI (XAI). However, the use of XAI in machine learning introduces privacy risks, which currently remain largely unnoticed. Therefore, we explore the possibility of an explanation linkage attack, which can occur when deploying instance-based strategies to find counterfactual explanations. To counter such an attack, we propose k-anonymous counterfactual explanations and introduce pureness as a metric to evaluate the validity of these k-anonymous counterfactual explanations. Our results show that making the explanations, rather than the whole dataset, k-anonymous is beneficial for the quality of the explanations.
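To make the two key notions concrete, the sketch below illustrates one plausible reading of them, not the authors' implementation: a counterfactual explanation is taken to be k-anonymous when its feature values are generalized (here, to value ranges) so that at least k records of the underlying data fall inside the generalized region, and pureness is assumed to be the fraction of those covered records that the model assigns the desired counterfactual class. All names (covers, is_k_anonymous, pureness) and the range-based generalization are illustrative assumptions.

```python
# Hypothetical sketch of k-anonymous counterfactual explanations and pureness.
# Assumption: an explanation generalizes features to inclusive [lo, hi] ranges;
# it is k-anonymous if >= k data records fall inside the generalized region,
# and pureness is the share of covered records predicted as the desired class.

from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
Range = Tuple[float, float]  # inclusive (lower, upper) bound per feature
Explanation = Dict[str, Range]


def covers(explanation: Explanation, record: Record) -> bool:
    """True if the record lies inside every generalized feature range."""
    return all(lo <= record[f] <= hi for f, (lo, hi) in explanation.items())


def is_k_anonymous(explanation: Explanation, data: List[Record], k: int) -> bool:
    """The explanation is indistinguishable among at least k data records."""
    return sum(covers(explanation, r) for r in data) >= k


def pureness(explanation: Explanation, data: List[Record],
             model: Callable[[Record], int], desired_class: int) -> float:
    """Fraction of covered records that the model assigns the desired class."""
    covered = [r for r in data if covers(explanation, r)]
    if not covered:
        return 0.0
    return sum(model(r) == desired_class for r in covered) / len(covered)


# Toy usage: a threshold model and a small dataset (illustrative only).
if __name__ == "__main__":
    model = lambda r: int(r["income"] > 50)  # predicts class 1 above threshold
    data = [{"income": 55.0}, {"income": 60.0}, {"income": 48.0}]
    explanation = {"income": (50.0, 65.0)}  # generalized counterfactual
    print(is_k_anonymous(explanation, data, k=2))          # True (covers 2)
    print(pureness(explanation, data, model, desired_class=1))  # 1.0
```

Under this reading, the trade-off the abstract points to is visible in the sketch: widening the ranges makes the explanation cover more records (higher k, stronger anonymity) but risks admitting records of the undesired class, lowering pureness.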