Understanding and Enhancing Robustness of Concept-Based Models

Sanchit Sinha, Jianhui Sun, Mengdi Huai, Aidong Zhang

doi:10.1609/aaai.v37i12.26765

Open Access

https://doi.org/10.1609/aaai.v37i12.26765

Abstract

The rising use of deep neural networks for decision making in critical applications such as medical diagnosis and financial analysis has raised concerns about their reliability and trustworthiness. As automated systems become more mainstream, it is important that their decisions be transparent, reliable, and understandable by humans for better trust and confidence. To this end, concept-based models such as Concept Bottleneck Models (CBMs) and Self-Explaining Neural Networks (SENN) have been proposed, which constrain the latent space of a model to represent high-level concepts easily understood by domain experts. Although concept-based models offer a promising approach to increasing both explainability and reliability, it has yet to be shown whether they are robust and output consistent concepts under systematic perturbations to their inputs. To better understand the performance of concept-based models on curated malicious samples, in this paper we study their robustness to adversarial perturbations, that is, imperceptible changes to the input data crafted by an attacker to fool a well-trained concept-based model. Specifically, we first propose and analyze different malicious attacks to evaluate the security vulnerability of concept-based models. Subsequently, we propose a general adversarial training-based defense mechanism that can increase the robustness of these systems to the proposed attacks. Extensive experiments on one synthetic and two real-world datasets demonstrate the effectiveness of the proposed attacks and the defense approach. An appendix with more comprehensive results is available at https://arxiv.org/abs/2211.16080.
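
To make the notion of an adversarial perturbation against a concept layer concrete, the following is a minimal sketch and not the attacks proposed in the paper. It assumes a hypothetical toy model with a linear-plus-sigmoid concept layer (the weights `W`, bias `b`, input `x`, and budget `eps` are made up for illustration) and applies a single FGSM-style signed-gradient step that increases the binary cross-entropy of the concept predictions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_concepts(W, b, x):
    """Toy concept layer: one sigmoid probability per concept."""
    return [sigmoid(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def fgsm_on_concepts(W, b, x, targets, eps):
    """One FGSM-style step that increases the concept BCE loss.

    For sigmoid outputs with binary cross-entropy, dL/dz_i = p_i - t_i,
    so dL/dx_j = sum_i (p_i - t_i) * W[i][j].
    """
    p = predict_concepts(W, b, x)
    grad = [sum((pi - ti) * row[j] for pi, ti, row in zip(p, targets, W))
            for j in range(len(x))]
    sign = [(g > 0) - (g < 0) for g in grad]  # -1, 0, or +1 per feature
    return [xj + eps * sj for xj, sj in zip(x, sign)]

# Hypothetical toy model: 2 concepts over 3 input features (assumed values).
W = [[2.0, -1.0, 0.5],
     [-1.5, 1.0, 1.0]]
b = [0.0, 0.0]
x = [0.2, 0.1, 0.1]

clean = predict_concepts(W, b, x)
# Treat the thresholded clean predictions as the concept labels to attack.
targets = [1.0 if p >= 0.5 else 0.0 for p in clean]

x_adv = fgsm_on_concepts(W, b, x, targets, eps=0.2)
adv = predict_concepts(W, b, x_adv)

print("clean concepts:", [round(p, 3) for p in clean])
print("adversarial concepts:", [round(p, 3) for p in adv])
```

In this toy setup, a perturbation of at most 0.2 per feature is enough to push both thresholded concept predictions to the other side of 0.5. An adversarial training-based defense, in the generic sense, would add such perturbed inputs to the training data so that the concept layer learns to keep its outputs stable under them.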


Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 26, 2023
Citations: 2
