Abstract
Autonomous driving relies on trustworthy visual recognition of surrounding objects, and few-shot image classification helps such systems recognize objects that are rarely seen. Successful embedding and metric-learning approaches to this task typically learn a framework for comparing the features of an unseen image with those of labeled images. However, these approaches often suffer from ambiguous feature embeddings because they tend to ignore important local visual and semantic information when extracting intra-class common features. In this paper, we introduce a Semantic-Aligned Attention (SAA) mechanism that refines feature embeddings and can be applied to most existing embedding and metric-learning approaches. The mechanism highlights pivotal local visual information with an attention mechanism and aligns the attention map with semantic information to refine the extracted features. Incorporating the proposed mechanism into the prototypical network yields competitive improvements on both few-shot and zero-shot classification tasks across various benchmark datasets.
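Since the abstract only summarizes the mechanism, the following PyTorch sketch illustrates one plausible reading of it: a class-semantic vector is projected into visual-feature space and used as an attention query over local feature-map locations, and the attention-refined embeddings feed a standard prototypical-network classifier. All identifiers (SemanticAlignedAttention, feat_dim, sem_dim, prototypical_logits) are illustrative assumptions, not names from the paper or its released code.

# Minimal sketch of semantic-aligned attention over local features,
# plugged into a prototypical network. Assumed design, not the authors' code.
import torch
import torch.nn as nn

class SemanticAlignedAttention(nn.Module):
    """Weights local visual features by attention guided by a semantic vector."""
    def __init__(self, feat_dim: int, sem_dim: int):
        super().__init__()
        # Project semantic embeddings (e.g., class word vectors) into visual space.
        self.sem_proj = nn.Linear(sem_dim, feat_dim)

    def forward(self, fmap: torch.Tensor, sem: torch.Tensor) -> torch.Tensor:
        # fmap: (B, C, H, W) convolutional feature map; sem: (B, sem_dim).
        B, C, H, W = fmap.shape
        query = self.sem_proj(sem)                      # (B, C) semantic query
        keys = fmap.flatten(2)                          # (B, C, H*W) local features
        # Attention logits: similarity of each spatial location to the query.
        logits = torch.einsum('bc,bcl->bl', query, keys) / C ** 0.5
        attn = logits.softmax(dim=-1)                   # (B, H*W) attention map
        # Attention-weighted pooling yields the refined image embedding.
        return torch.einsum('bl,bcl->bc', attn, keys)   # (B, C)

def prototypical_logits(support_emb, support_labels, query_emb, n_way):
    # Class prototypes are means of the attended support embeddings;
    # queries are scored by negative Euclidean distance to each prototype.
    protos = torch.stack([support_emb[support_labels == k].mean(0)
                          for k in range(n_way)])      # (n_way, C)
    return -torch.cdist(query_emb, protos)             # (n_query, n_way)

# Example 5-way episode with hypothetical dimensions (64-d visual features,
# 300-d semantic vectors); shapes only, no trained weights.
saa = SemanticAlignedAttention(feat_dim=64, sem_dim=300)
support_emb = saa(torch.randn(5, 64, 5, 5), torch.randn(5, 300))  # (5, 64)

Under this reading, zero-shot classification falls out naturally: with no support images, the projected semantic vectors themselves can serve as class prototypes, which is consistent with the abstract's claim that the same mechanism helps both settings.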