Abstract

Zero-Shot Learning (ZSL) aims to recognise object classes that are not observed during the training phase. Most existing ZSL methods focus on learning a compatibility function between the image representation and the class semantic information. A few others concentrate on learning image representations by combining local and global features. However, existing approaches still fail to address the bias towards seen classes. This paper proposes implicit and explicit attention mechanisms to address this bias problem in generalised ZSL models. We formulate the implicit attention mechanism through a self-supervised image rotation-prediction task, which directs the model towards the specific image features needed to solve that task. The explicit attention mechanism is realised through the multi-headed self-attention of a Vision Transformer (ViT), which learns to attend to important image locations and maps global image features to the semantic space during training. We conduct comprehensive experiments on three popular benchmarks, AWA2, CUB and SUN, and demonstrate the effectiveness of the proposed attention mechanisms in both discriminative and generative settings. Our method achieves state-of-the-art performance, obtaining the highest harmonic mean on all three datasets, which makes ViT-based attention mechanisms an encouraging direction for future ZSL work.
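
To make the two branches concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: `vit_backbone`, `feat_dim`, `attr_dim`, the loss forms, and their equal weighting are hypothetical placeholders. The explicit branch maps the ViT's global feature to the class-attribute (semantic) space, while the implicit branch predicts which of four rotations (0°, 90°, 180°, 270°) was applied to the input.

```python
import torch
import torch.nn as nn


class AttentionZSL(nn.Module):
    """Hypothetical sketch: a ViT backbone with an explicit semantic-mapping head
    and an implicit self-supervised rotation-prediction head."""

    def __init__(self, vit_backbone, feat_dim=768, attr_dim=312, num_rotations=4):
        super().__init__()
        self.backbone = vit_backbone                            # any ViT returning a global (e.g. [CLS]) feature
        self.semantic_head = nn.Linear(feat_dim, attr_dim)      # explicit: map image feature to class semantics
        self.rotation_head = nn.Linear(feat_dim, num_rotations) # implicit: predict the applied rotation

    def forward(self, images):
        feats = self.backbone(images)                           # global features via multi-head self-attention
        return self.semantic_head(feats), self.rotation_head(feats)


def zsl_losses(sem_pred, rot_pred, class_attrs, labels, rot_labels):
    # Explicit branch: regress the attribute vector of the ground-truth seen class.
    sem_loss = nn.functional.mse_loss(sem_pred, class_attrs[labels])
    # Implicit branch: classify which of the four rotations was applied.
    rot_loss = nn.functional.cross_entropy(rot_pred, rot_labels)
    return sem_loss + rot_loss
```

At inference, an unseen image would be scored against each unseen class by comparing the predicted semantic vector with the class attribute vectors (e.g. via cosine similarity), following the usual compatibility-based ZSL protocol.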
