Towards improving classification power for one-shot object detection

Hanqing Yang, Yongliang Lin, Hong Zhang, Yu Zhang, Bin Xu

doi:10.1016/j.neucom.2021.04.116

Abstract

Object detection based on deep learning typically relies on a large amount of training data, which can be very labor-intensive to prepare. In this paper, we tackle this problem by addressing the One-Shot Object Detection (OSOD) task. Given a query image whose category is not included in the training data, OSOD aims to detect objects of the same class in a complex scene, denoted as the target image. The performance of recent OSOD methods is much weaker than that of general object detection. We find that one reason for this limited performance is that OSOD methods generate more false positives (i.e., false detections). We therefore argue that reducing the number of false positives is important for improving performance on the OSOD task. To this end, we present a Focus On Classification One-Shot Object Detection (FOC OSOD) network. Specifically, we design the network around two questions: (1) how to obtain an effective similarity feature between the query image and the target image; (2) how to classify the similarity feature effectively. To address the first, we propose a Classification Feature Deformation-and-Attention (CFDA) module that obtains high-quality query and target features, from which an effective similarity feature can be generated. To address the second, we present a Split Iterative Head (SIH) that improves the ability to classify the similarity feature. Extensive experiments on two public datasets (i.e., PASCAL VOC and COCO) demonstrate that the proposed framework outperforms other state-of-the-art methods by a considerable margin.
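To make the false-positive problem concrete, the sketch below scores candidate regions of a target image against a query feature and keeps those above a threshold. It is an illustrative toy under stated assumptions, not the CFDA module or the SIH: the use of cosine similarity, the function names `cosine_similarity` and `match_proposals`, the threshold values, and the three-dimensional feature vectors are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero feature as matching nothing
    return dot / (norm_a * norm_b)

def match_proposals(query_feature, proposal_features, threshold=0.8):
    """Return (index, score) for proposals whose similarity to the
    query feature meets the threshold (hypothetical decision rule)."""
    matches = []
    for index, feature in enumerate(proposal_features):
        score = cosine_similarity(query_feature, feature)
        if score >= threshold:
            matches.append((index, score))
    return matches

# Toy features: one query vector and three candidate regions.
query = [1.0, 0.0, 1.0]
proposals = [
    [2.0, 0.0, 2.0],  # same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 1.0, 1.0],  # partially similar: a potential false positive
]

loose_matches = match_proposals(query, proposals, threshold=0.8)
strict_matches = match_proposals(query, proposals, threshold=0.9)
```

With the looser threshold, the partially similar third region is accepted alongside the true match; the stricter threshold rejects it. A fixed threshold on a raw similarity score is a crude classifier, which is the kind of weakness a stronger similarity feature and classification head are meant to address.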

Full Text