Few-Shot Image Classification Based on Cross-Dimensional Interactive Attention

Ying Liu, Yilun Bai, Hengchang Zhang, Xin Che

doi:10.1145/3573942.3574099

Abstract

In recent years, deep learning techniques have achieved great success in traditional image classification tasks. However, with only a small amount of labeled data, they often struggle to achieve good results and are prone to overfitting. Researchers have therefore turned to image classification methods based on few-shot learning. The prototype network uses the mean of each class's support set samples as that class's prototype and classifies a query set sample by computing its distance to each prototype. To enhance the feature representation capability of the prototype network, this paper proposes a few-shot image classification method based on cross-dimensional interactive attention. The algorithm uses a pre-trained ResNet-12 model to extract deep image features and introduces a cross-dimensional interactive attention mechanism that links information between the channel and spatial dimensions through triple attention, enhancing the information interaction across dimensions. Meanwhile, to address the prototype network's insufficient generalization ability, the algorithm uses a gradient-centered optimization algorithm that zero-means the weight gradients, which improves both the generalization ability of the network and the classification accuracy. Extensive experimental results show that the proposed algorithm performs well on few-shot image classification tasks.
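As a rough illustration of two ideas named in the abstract, the sketch below computes class prototypes as support-set means, assigns a query to the nearest prototype, and zero-means a weight gradient. It is not the authors' implementation: the function names, the toy 2-D feature vectors (standing in for ResNet-12 features), the choice of Euclidean distance, and the row-wise centering axis are all assumptions made for the example. The attention mechanism is not covered here.

```python
import math


def prototypes(support):
    """Map each class label to the mean of its support feature vectors.

    `support` is a dict: label -> list of equal-length feature vectors.
    """
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos


def classify(query, protos):
    """Return the label whose prototype is nearest to `query`.

    Euclidean distance is an assumption; the abstract only says "distance".
    """
    return min(protos, key=lambda label: math.dist(query, protos[label]))


def center_gradient(grad):
    """Subtract each row's mean from a weight gradient given as a list of rows.

    Centering per row (one row per output unit) is an assumed convention.
    """
    centered = []
    for row in grad:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


# Toy 2-way, 2-shot episode with hypothetical features.
support = {
    "cat": [[1.0, 0.0], [3.0, 0.0]],
    "dog": [[0.0, 4.0], [0.0, 6.0]],
}
protos = prototypes(support)
label = classify([1.5, 0.5], protos)

# Toy gradient for a layer with two output units and three inputs.
centered = center_gradient([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]])
```

In this toy episode the query lies closest to the "cat" prototype, and every row of the centered gradient sums to zero, which is the zero-mean property the abstract refers to.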

Full Text