Abstract

Few-shot object detection (FSOD) aims to train an object detector that can rapidly adapt to detect novel classes from only a few annotated examples. Existing meta-learning-based FSOD networks have achieved substantial progress; however, they still suffer from two drawbacks. First, they neglect the fact that more discriminative support features can boost the performance of few-shot learning. Second, the commonly used channel-wise feature interaction lacks spatial information, which can lead to a typical failure in which an object is well localized but assigned the wrong label. We study how to leverage the attention mechanism and a margin-based softmax loss to tackle the FSOD task. Specifically, we select the cosine margin loss, which encourages learned features to have minimal within-class variance and maximal between-class variance, to optimize the lightweight convolutional neural network of the independent support set branch; this endows the extracted support features with better discrimination. In addition, we design an affinity relation reasoning module (ARRM) to promote the interaction between the support features and the region of interest (ROI) features. The ARRM uses element-wise spatial attention to integrate distinct features via an affinity matrix that measures the relationship between the support features and the ROI features. The ARRM also introduces holistic channel attention as a supplement to spatial attention. The holistic channel attention provides global semantic context about the support features, which can alleviate the misclassification problem. We empirically evaluate the proposed network on the Pascal Visual Object Classes (VOC) and Microsoft Common Objects in Context (COCO) benchmarks, and the experimental results demonstrate that our network achieves state-of-the-art performance.
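To make the margin-based softmax idea concrete, the following is a minimal sketch of a cosine margin loss for a single sample, written in plain Python. It follows the commonly used large-margin cosine formulation, in which the target-class cosine similarity is reduced by a margin before a scaled softmax; the abstract does not give the exact formulation used in the paper, so the function names and the default `scale` and `margin` values are illustrative assumptions, not the authors' settings.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (guarding against a zero norm)."""
    norm = math.sqrt(sum(x * x for x in vector))
    norm = max(norm, 1e-12)
    return [x / norm for x in vector]


def cosine_margin_loss(feature, class_weights, label, scale=30.0, margin=0.35):
    """Cosine margin loss for one sample.

    feature:       list of floats, the embedding of one support sample
    class_weights: list of per-class weight vectors (same length as feature)
    label:         index of the ground-truth class
    scale, margin: hypothetical hyperparameters chosen for illustration
    """
    f = l2_normalize(feature)
    # Cosine similarity between the feature and every class weight vector.
    cosines = [
        sum(a * b for a, b in zip(f, l2_normalize(w))) for w in class_weights
    ]
    # Subtract the margin from the target-class cosine only, then rescale.
    logits = [
        scale * (c - margin) if j == label else scale * c
        for j, c in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp for the softmax cross-entropy.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]
```

Because the margin lowers only the target-class logit, a sample must sit closer in angle to its own class weight vector than to any other to reach a low loss, which is the mechanism behind the small within-class and large between-class variance described above.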
