Abstract
In this study, we present a three-dimensional (3D) object detection algorithm based on monocular images by constructing an end-to-end network, that incorporates depth information. The entire network consists of three parts. The first part includes the basic object detection neural network as the main body, that uses the region proposal network to obtain the two-dimensional (2D) region proposal of the object. The second part is the depth estimation branch network, that obtains the depth information of the object pixels and calculates the corresponding 3D point cloud. In the last part, concatenated features obtained from the aforementioned two parts are fed into the fully-connected layers. Subsequently, 2D and 3D detection results are obtained. Compared with certain existing methods, the accuracy of the detection results is improved in this study.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have