Abstract

Few-shot object detection (FSOD) aims to reduce the large number of annotations that conventional object detection requires for training, which are labor-intensive to produce. However, existing few-shot methods either achieve high precision at the cost of time-consuming, exhaustive fine-tuning or adapt poorly to novel classes. We presume the major reason is that the valuable correlation features among different categories are insufficiently exploited, which hinders the generalization of knowledge from base to novel categories. In this paper, we propose few-shot object detection via Correlation-RPN and transformer encoder–decoder (CRTED), a novel training network that learns object-relevant features of inter-class correlation and intra-class compactness while suppressing object-agnostic background features, using limited annotated samples. We also introduce a four-way tuple-contrast training strategy to positively activate the training process of our object detector. Experiments on two few-shot benchmarks (Pascal VOC, MS-COCO) demonstrate that the proposed CRTED, without further fine-tuning, achieves performance comparable to current state-of-the-art fine-tuned methods. The code and pre-trained models will be released.
