Abstract

Most recent state-of-the-art object detection systems adopt an anchor box mechanism to simplify the detection model. The neural network only needs to regress the mapping from anchor boxes to ground truth boxes; prediction boxes are then calculated from the network outputs and the default anchor boxes. However, as the problem becomes more complex, the number of default anchor boxes grows, which brings a large risk of over-fitting during training. In this paper, we adopt an adaptive anchor box mechanism in which one anchor box can cover more ground truth boxes. The network therefore needs only a few adaptive anchor boxes to solve the same problem, and the model becomes more robust. The sizes of the adaptive anchor boxes are adjusted automatically according to the depth collected by a Time of Flight (TOF) camera, and the network adjusts the aspect ratios of the anchor boxes to obtain the final prediction boxes. Experimental results demonstrate that the proposed method yields more accurate detection results. Specifically, with the proposed adaptive anchor box mechanism, the Mean Average Precision (mAP) of the YOLO-v2 and YOLO-v3 networks increases noticeably on open public datasets and on our self-built battery image dataset. Moreover, visual comparisons of the predictions illustrate that the proposed adaptive anchor box mechanism achieves better performance than the original anchor box mechanism.
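
To make the idea concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes the standard YOLO-v2 style box decoding (a sigmoid offset inside a grid cell, exponential scaling of the anchor width and height) and, as a purely illustrative assumption, an anchor size that shrinks in inverse proportion to the measured depth. The names `adaptive_anchor_size`, `decode_box`, `ref_size` and `ref_depth_m` are hypothetical and do not come from the abstract.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used to keep the box centre inside its grid cell."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_anchor_size(depth_m: float, ref_size: float = 4.0,
                         ref_depth_m: float = 1.0) -> float:
    """Scale a reference anchor size by depth (hypothetical rule).

    Assumption: apparent object size is inversely proportional to depth,
    so an object twice as far away gets an anchor half as large.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return ref_size * ref_depth_m / depth_m


def decode_box(t, cell, anchor_w: float, anchor_h: float):
    """Decode raw outputs (tx, ty, tw, th) into a box, YOLO-v2 style.

    `cell` is the (column, row) index of the grid cell; all results are
    expressed in grid-cell units.
    """
    tx, ty, tw, th = t
    cx, cy = cell
    bx = cx + sigmoid(tx)          # centre x, offset within the cell
    by = cy + sigmoid(ty)          # centre y, offset within the cell
    bw = anchor_w * math.exp(tw)   # width: anchor scaled by the network
    bh = anchor_h * math.exp(th)   # height: scaled independently, so the
                                   # aspect ratio comes from the network
    return bx, by, bw, bh


# Usage: one square anchor whose size comes from depth; the network
# outputs tw and th then only reshape it.
size = adaptive_anchor_size(depth_m=2.0)
box = decode_box((0.0, 0.0, 0.0, 0.0), cell=(3, 5),
                 anchor_w=size, anchor_h=size)
```

In this sketch a single depth-scaled square anchor takes the place of a set of fixed anchors of different sizes, so the network outputs mainly have to account for aspect ratio. How the paper actually maps depth to anchor size is not stated in the abstract.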
