Abstract

In both 2D and 3D object detection algorithms, the reference boxes produced by the anchor mechanism lay the foundation for the network's subsequent detection tasks. Most existing anchor mechanisms are designed by hand and generate a dense array of anchors. Anchors obtained this way have a single fixed size and are densely distributed across the image space, which leads to poor robustness and a large number of redundant anchors. In view of these defects, we propose a 3D anchor generating network. It predicts 3D anchors by learning the semantic features of the image, so that the anchors are sparsely distributed around the objects and take different sizes at different positions in the image. We applied it to TLNet's monocular baseline network for 3D object detection on the KITTI dataset, and the results showed a significant improvement in 3D object detection performance.
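
To illustrate why hand-designed dense anchors are redundant, the following minimal sketch enumerates one box per grid position, size, and orientation. It is not the method proposed here or TLNet's actual anchor code; the function name `dense_anchors_3d`, the grid extent, the stride, the single box size, and the two yaw angles are all hypothetical values chosen only for illustration.

```python
from itertools import product


def dense_anchors_3d(x_min, z_min, stride, n_x, n_z, sizes, yaws, y=0.0):
    """Enumerate hand-designed dense 3D anchors on a ground-plane grid.

    Hypothetical helper for illustration: every grid cell receives every
    (size, yaw) combination, regardless of what the image contains.
    Each anchor is a tuple (x, y, z, h, w, l, yaw).
    """
    anchors = []
    for ix, iz, (h, w, l), yaw in product(range(n_x), range(n_z), sizes, yaws):
        x = x_min + ix * stride  # lateral position on the grid
        z = z_min + iz * stride  # depth position on the grid
        anchors.append((x, y, z, h, w, l, yaw))
    return anchors


# Assumed setup: an 80 m x 70 m ground area sampled every 0.5 m,
# one fixed box size (h, w, l) in metres, and two orientations.
anchors = dense_anchors_3d(
    x_min=-40.0, z_min=0.0, stride=0.5, n_x=161, n_z=141,
    sizes=[(1.5, 1.6, 3.9)], yaws=[0.0, 1.5708],
)
print(len(anchors))  # 161 * 141 * 1 * 2 = 45402
```

Under these assumed settings the grid yields tens of thousands of identically sized anchors, while a typical scene contains only a handful of objects. A learned anchor generator aims to replace this enumeration with a small set of anchors whose positions and sizes depend on the image content.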
