ExplainLFS: Explaining neural architectures for similarity learning from local perturbations in the latent feature space

Marilyn Bello, Gonzalo Nápoles, Pablo Costa, Pablo Mesejo, Óscar Cordón

doi:10.1016/j.inffus.2024.102407

Abstract

Despite the growing number of explainability techniques developed for deep neural networks in recent years, only a few are dedicated to explaining the decisions made by neural networks for similarity learning. While existing approaches can explain classification models, adapting them to generate visual similarity explanations is not trivial. Neural architectures devoted to this task learn an embedding that maps similar examples to nearby vectors and dissimilar examples to distant vectors in the feature space. In this paper, we propose a post-hoc, model-agnostic technique that explains the inference of such architectures on a pair of images. The proposed method establishes a relation between the most important features of the abstract (latent) feature space and the input feature space (pixels) of an image. For this purpose, we employ a relevance assignment and a perturbation process based on the latent features that are most influential in the inference. The images of the pair are then reconstructed from the perturbed embedding vectors, which relates the latent features to the original input features. The results indicate that our method produces “continuous” and “selective” explanations. A sharp drop in the value of the function (summarized by a low area under the curve) indicates that our method outperforms other explainability approaches at identifying the features relevant to similarity learning. In addition, we demonstrate that our technique is agnostic to the specific type of similarity model by showing its applicability in two similarity learning tasks: face recognition and image retrieval.
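The pipeline summarized above (relevance assignment, perturbation of the most influential latent features, reconstruction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: relevance is taken to be each latent dimension's contribution to the cosine similarity, the perturbation zeroes the top-k dimensions, and the decoder is a hypothetical random linear map named `decode`.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def latent_relevance(z1, z2):
    """Assumed relevance score: per-dimension contribution to the cosine similarity."""
    return (z1 / np.linalg.norm(z1)) * (z2 / np.linalg.norm(z2))

def perturb_top_k(z, relevance, k):
    """Assumed perturbation: zero out the k latent features with the highest relevance."""
    idx = np.argsort(relevance)[::-1][:k]
    z_pert = z.copy()
    z_pert[idx] = 0.0
    return z_pert

def explanation_map(z, z_pert, decode):
    """Pixel-space difference between reconstructions of the original and perturbed embeddings."""
    return np.abs(decode(z) - decode(z_pert))

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 8))  # hypothetical linear decoder weights: 8 latent features -> 64 pixels

def decode(z):
    """Hypothetical stand-in for a trained decoder; returns an 8x8 'image'."""
    return (W @ z).reshape(8, 8)

z1 = rng.normal(size=8)             # embedding of the first image (toy data)
z2 = z1 + 0.1 * rng.normal(size=8)  # embedding of a similar second image (toy data)

rel = latent_relevance(z1, z2)
z1_pert = perturb_top_k(z1, rel, k=2)
heatmap = explanation_map(z1, z1_pert, decode)
```

In this sketch, `heatmap` marks the pixels whose reconstruction changes when the most relevant latent features are removed, which is the sense in which latent features are related back to input features. A trained decoder and the relevance measure actually used by the method would replace the stand-ins here.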

Full Text