Do Neural Transformers Learn Human-Defined Concepts? An Extensive Study in Source Code Processing Domain

Claudio Ferretti, Martina Saletta

doi:10.3390/a15120449

Abstract

State-of-the-art neural networks build an internal model of the training data, tailored to a given classification task. Studying such a model is of interest, and research on explainable artificial intelligence (XAI) therefore investigates whether, in the internal states of a network, it is possible to identify rules that associate data with their corresponding classification. This work contributes to XAI research on neural networks trained to classify source code snippets, in the specific domain of cybersecurity. In this context, textual instances typically must first be encoded into numerical vectors with a non-invertible transformation before they can be fed to the models, which limits the applicability of known XAI methods based on the differentiation of neural signals with respect to real-valued instances. In this work, we start from the known TCAV method, designed to study the human-understandable concepts that emerge in the internal layers of a neural network, and we adapt it to transformer architectures trained to solve source code classification problems. We first determine domain-specific concepts (e.g., the presence of given patterns in the source code), and for each concept, we train support vector classifiers to separate points in the vector activation spaces that represent input instances with the concept from those without the concept. Then, we study whether the presence (or absence) of such concepts affects the decision process of the neural network. Finally, we discuss how our approach contributes to general XAI goals, and we suggest specific applications in the source code analysis field.
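
The concept-separation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the activations are synthetic 3-D toy vectors, a hinge-loss linear separator trained by subgradient descent stands in for the support vector classifiers, and the gradient values are hypothetical placeholders for what would, in a real setting, be the derivatives of a class logit with respect to a layer's activations. The names `train_linear_separator`, `toy_activations` and `logit_gradients` are assumptions made for illustration. The final score follows the original TCAV formulation: the fraction of inputs whose logit gradient points in the same direction as the concept activation vector (CAV).

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_separator(points, labels, epochs=200, lr=0.05, reg=0.01):
    """Fit w, b by subgradient descent on the L2-regularized hinge loss.

    Labels are +1 (concept present) or -1 (concept absent).
    This is a simple stand-in for a linear support vector classifier.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            violated = y * (dot(w, x) + b) < 1      # inside the margin?
            w = [wi * (1 - lr * reg) for wi in w]   # L2 shrinkage
            if violated:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


rng = random.Random(0)


def toy_activations(center, n):
    """Synthetic layer activations; the concept shifts the first coordinate."""
    return [[center + rng.gauss(0, 0.3), rng.gauss(0, 0.3), rng.gauss(0, 0.3)]
            for _ in range(n)]


with_concept = toy_activations(2.0, 20)      # inputs that contain the concept
without_concept = toy_activations(-2.0, 20)  # inputs that do not
points = with_concept + without_concept
labels = [1] * 20 + [-1] * 20

w, b = train_linear_separator(points, labels)

# The CAV is the unit normal of the separating hyperplane.
norm = math.sqrt(dot(w, w))
cav = [wi / norm for wi in w]

accuracy = sum(1 for x, y in zip(points, labels)
               if y * (dot(w, x) + b) > 0) / len(points)

# Hypothetical gradients of a class logit w.r.t. the layer activations,
# one per input instance (placeholders, not measured values).
logit_gradients = [[1.0, 0.0, 0.0], [0.5, 0.0, 0.0],
                   [-0.3, 0.0, 0.0], [0.8, 0.0, 0.0]]

# TCAV-style score: fraction of inputs with positive directional derivative.
score = sum(1 for g in logit_gradients if dot(g, cav) > 0) / len(logit_gradients)

print("separator accuracy:", accuracy)
print("concept sensitivity score:", score)
```

A score near 1 would suggest that moving activations toward the concept tends to raise the class logit, while a score near 0 would suggest the opposite. In practice the activations would be read from a chosen layer of the trained transformer rather than generated synthetically.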

Journal: Algorithms
Publication Date: Nov 29, 2022
Citations: 5
License type: CC BY 4.0
