Abstract

State-of-the-art (SOTA) object detectors fall into two types: two-stage detectors (Mask R-CNN, Faster R-CNN, Fast R-CNN) and one-stage detectors (SSD and YOLO). Two-stage detectors first generate region proposals and then extract deep features from them for bounding-box regression and classification. These models achieve higher accuracy but run slowly. One-stage detectors instead take the image as input without generating region proposals and perform object detection through regression and classification only. These methods show lower accuracy; however, they are more robust than two-stage detectors. In our research, we examine both types of detectors, including Mask R-CNN, SSD, and RetinaNet, and compare them while varying the backbone CNN architecture (Inception V2, ResNet-50). The methods are evaluated on a subset of the challenging PASCAL VOC 2012 and MS-COCO datasets.
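The difference in control flow between the two detector families can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions, not the implementation evaluated in this work: the function names (two_stage_detect, one_stage_detect) and the toy stand-ins for the proposal network, feature extractor, and prediction heads are all hypothetical and return fixed values.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, str, float]        # (box, class label, score)


def two_stage_detect(image, propose, extract, head) -> List[Detection]:
    """Stage 1 proposes regions; stage 2 regresses and classifies each one."""
    detections = []
    for region in propose(image):                  # stage 1: region proposals
        features = extract(image, region)          # deep features per proposal
        detections.append(head(features, region))  # box regression + classification
    return detections


def one_stage_detect(image, extract, dense_head) -> List[Detection]:
    """Single pass: no proposals, predictions come straight from image features."""
    features = extract(image, None)
    return dense_head(features)


# Hypothetical toy components (assumptions for illustration only).
def toy_propose(image):
    return [(0.0, 0.0, 10.0, 10.0), (5.0, 5.0, 20.0, 20.0)]


def toy_extract(image, region):
    return {"image": image, "region": region}


def toy_head(features, region):
    return (region, "object", 0.5)


def toy_dense_head(features):
    return [((0.0, 0.0, 10.0, 10.0), "object", 0.5)]


if __name__ == "__main__":
    print(two_stage_detect("img", toy_propose, toy_extract, toy_head))
    print(one_stage_detect("img", toy_extract, toy_dense_head))
```

In this sketch the two-stage path calls the feature extractor and head once per proposal, which hints at why such models tend to be slower, while the one-stage path makes a single pass over the image.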

