Abstract

The region proposal network is indispensable to two-stage object detection methods. It generates a fixed number of proposals, which the detection heads then classify and regress to produce detection boxes. However, a fixed number of proposals may be too large when an image contains only a few objects and too small when it contains many more. The authors therefore explored setting the number of proposals according to the number of objects in an image, with the aim of reducing computational cost while improving detection accuracy. Since the number of ground-truth objects is unknown at inference time, the authors designed a simple but effective module that predicts the number of foreground regions, which serves as a proxy for the number of objects when determining the proposal number. Experiments with various two-stage detection methods on several datasets, including MS-COCO, PASCAL VOC, and CrowdHuman, showed that adding the designed module increased detection accuracy while decreasing the FLOPs of the detection head. For example, on the PASCAL VOC dataset, applying the module to Libra R-CNN and Grid R-CNN increased AP50 by more than 1.5 points while decreasing the FLOPs of the detection heads from 28.6 G to nearly 9.0 G.
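The idea of tying the proposal count to a predicted foreground count can be illustrated with a minimal Python sketch. The abstract does not state how the predicted count is mapped to a proposal number, so the linear scaling with clamping used here, along with the names `proposal_budget`, `select_proposals`, `per_region`, `min_proposals`, and `max_proposals`, are illustrative assumptions rather than the authors' method.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def proposal_budget(predicted_foreground: float,
                    per_region: int = 10,
                    min_proposals: int = 50,
                    max_proposals: int = 1000) -> int:
    """Map a predicted foreground-region count to a proposal count.

    Assumption: a linear scaling clamped to [min_proposals, max_proposals].
    The actual mapping used in the paper is not given in the abstract.
    """
    raw = int(round(max(predicted_foreground, 0.0) * per_region))
    return max(min_proposals, min(max_proposals, raw))


def select_proposals(boxes: List[Box],
                     scores: List[float],
                     predicted_foreground: float,
                     **budget_kwargs: int) -> List[Box]:
    """Keep only the highest-scoring proposals, up to the image's budget."""
    budget = proposal_budget(predicted_foreground, **budget_kwargs)
    # Rank proposal indices by objectness score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [boxes[i] for i in order[:budget]]
```

Under these assumptions, an image with few predicted foreground regions passes far fewer proposals to the detection head than a crowded one, which is where a reduction in head FLOPs would come from.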
