In supervised deep learning object detection, the quantity of object information and the quality of annotations in a dataset affect model performance. To augment object detection datasets while preserving the contextual relationship between objects and their backgrounds, we propose a Background Instance-Based Copy-Paste (BIB-Copy-Paste) data augmentation model. We devise a method that generates background pseudo-labels for all object classes by calculating the similarity between object background features and image region features in Euclidean space. A background classifier trained with these pseudo-labels then guides copy-pasting so that pasted objects remain contextually relevant. Across several supervised object detectors evaluated on the PASCAL VOC 2012 dataset, the method yielded an average improvement of 1.1% in mean average precision (mAP). Ablation experiments with the BlitzNet object detector on PASCAL VOC 2012 showed an mAP improvement of 1.19% with the proposed method, compared to 0.18% with random copy-paste. Images from the MS COCO dataset containing objects of the same classes as PASCAL VOC 2012 were also selected for object pasting experiments. The contextual relevance of the pasted objects demonstrated our model’s effectiveness and its transferability between datasets that share object classes.
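The pseudo-labeling idea described above can be illustrated with a minimal sketch. It assumes that every candidate image region and every class's object background is already represented by a fixed-length feature vector. The function name, the `max_dist` cutoff, the nearest-class rule, and the toy feature values are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math

def background_pseudo_labels(region_feats, class_bg_feats, max_dist):
    """Label each region with the class whose background feature is nearest.

    region_feats: list of feature vectors, one per candidate image region.
    class_bg_feats: dict mapping class name -> background feature vector.
    max_dist: hypothetical cutoff; a region farther than this from every
              class background receives no label (None).
    """
    labels = []
    for feat in region_feats:
        # Euclidean distance from this region to each class's background feature.
        dists = {cls: math.dist(feat, bg) for cls, bg in class_bg_feats.items()}
        best_cls = min(dists, key=dists.get)
        labels.append(best_cls if dists[best_cls] <= max_dist else None)
    return labels

# Toy 2-D features, made up purely for illustration.
class_bg = {"boat": [0.0, 1.0], "car": [1.0, 0.0]}
regions = [[0.1, 0.9], [0.9, 0.2], [5.0, 5.0]]
labels = background_pseudo_labels(regions, class_bg, max_dist=0.5)
print(labels)
```

In this sketch a smaller Euclidean distance stands in for higher similarity, and regions that match no class background are left unlabeled so they would not be used as paste locations.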