Improving small objects detection using transformer

Shikha Dubey, Farrukh Olimov, Muhammad Aasim Rafique, Moongu Jeon

doi:10.1016/j.jvcir.2022.103620


Open Access



Abstract

General artificial intelligence counteracts the inductive bias of an algorithm and tunes the algorithm for out-of-distribution generalization. A conspicuous impact of the inductive bias is an unceasing trend in improving deep learning performance. Although DETR, a quintessential attention-based object detection technique, shows better accuracy than its predecessors, its accuracy deteriorates when detecting small (in-perspective) objects. This study examines the inductive bias of DETR and proposes SOF-DETR, which adds a normalized inductive bias for object detection using data fusion. SOF-DETR introduces a lazy-fusion technique for features that preserves deep contextual information about the objects present in an image. The features from multiple subsequent deep layers are fused for object queries, which learn long- and short-distance spatial associations in an image through the attention mechanism. Experimental results on the MS COCO and Udacity Self Driving Car datasets support the effectiveness of the added normalized inductive bias and feature fusion techniques, showing increased COCO mAP scores on small objects.
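The fusion idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion operator, so everything here is an assumption for illustration only: the function names (`upsample_nearest`, `fuse_late`, `to_tokens`) are hypothetical, feature maps are plain nested lists, and fusion is taken to be nearest-neighbour resizing followed by channel concatenation. This is not the SOF-DETR implementation.

```python
from typing import List

# A feature map as nested lists: [channels][height][width].
FeatureMap = List[List[List[float]]]


def upsample_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Nearest-neighbour resize of a [C][H][W] map to [C][out_h][out_w]."""
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [
        [
            [ch[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
            for y in range(out_h)
        ]
        for ch in fmap
    ]


def fuse_late(fmaps: List[FeatureMap]) -> FeatureMap:
    """Fuse maps from successive layers (assumed operator: concatenation).

    Every map is resized to the resolution of the first (highest-resolution)
    map, then all channels are stacked into a single map.
    """
    out_h, out_w = len(fmaps[0][0]), len(fmaps[0][0][0])
    fused: FeatureMap = []
    for fmap in fmaps:
        fused.extend(upsample_nearest(fmap, out_h, out_w))
    return fused


def to_tokens(fmap: FeatureMap) -> List[List[float]]:
    """Flatten [C][H][W] into H*W tokens of length C, in row-major order.

    This is the sequence layout an attention encoder typically consumes.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[k][y][x] for k in range(c)] for y in range(h) for x in range(w)]


# Toy input: a 1-channel 4x4 map from an earlier layer and a 2-channel 2x2 map
# from a deeper layer.
shallow = [[[float(y * 4 + x) for x in range(4)] for y in range(4)]]
deep = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]

fused = fuse_late([shallow, deep])   # 3 channels at 4x4 resolution
tokens = to_tokens(fused)            # 16 tokens, each of length 3
```

Resizing to the highest-resolution map keeps the fine spatial grid that small objects depend on, while the deeper channels contribute context to every token. Other fusion operators, such as summation or learned projections, would fit the same outline.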

Full Text