Abstract

Three-dimensional (3D) object detection plays an important role in computer vision and intelligent transportation systems: the locations and orientations of obstacles in a road scene can be estimated to support navigation for unmanned vehicles. In this paper, we propose a novel network architecture called Frustum FusionNet (F-FusionNet), which effectively extracts and concatenates features from frustum point clouds and RGB images to produce amodal 3D object detection results. To detect objects of different sizes simultaneously, our method divides each frustum point cloud into continuous segments. Our MSE-Net module extracts segment-wise local features at multiple scales using a multi-scale sliding window and fuses them with a segment-wise adaptive learning fusion algorithm. The image features are then aggregated in F-FusionNet to refine the 3D detections. For practicality, the method uses exactly the same network architecture and parameters across all object categories. Extensive experiments and comparisons on the KITTI road-scene benchmark demonstrate the effectiveness of the proposed method.
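
The abstract describes MSE-Net only at a high level. The sketch below is a minimal illustration under stated assumptions, not the authors' published implementation: it splits a frustum point cloud into continuous depth segments, encodes each segment with a PointNet-style encoder, and fuses multi-scale sliding-window aggregates with learned per-scale softmax weights as one plausible reading of the segment-wise adaptive learning fusion. All names and hyperparameters here (NUM_SEGMENTS, WINDOW_SIZES, FEAT_DIM, SegmentEncoder, MultiScaleSlidingWindow) are assumptions for illustration.

# Minimal illustrative sketch (PyTorch); all names and hyperparameters are
# assumptions, not the published F-FusionNet implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_SEGMENTS = 8          # assumed number of continuous frustum segments
WINDOW_SIZES = (1, 3, 5)  # assumed sliding-window extents, in segments (odd sizes)
FEAT_DIM = 64             # assumed per-segment feature dimension

def segment_frustum(points, num_segments=NUM_SEGMENTS):
    """Split a frustum point cloud (N, 3) into continuous segments along depth (z)."""
    depth = points[:, 2]
    edges = torch.linspace(depth.min().item(), depth.max().item(), num_segments + 1)
    segments = []
    for i in range(num_segments):
        upper = depth <= edges[i + 1] if i == num_segments - 1 else depth < edges[i + 1]
        segments.append(points[(depth >= edges[i]) & upper])
    return segments

class SegmentEncoder(nn.Module):
    """PointNet-style encoder: shared per-point MLP followed by max pooling."""
    def __init__(self, feat_dim=FEAT_DIM):
        super().__init__()
        self.feat_dim = feat_dim
        self.mlp = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, feat_dim))

    def forward(self, pts):
        if pts.shape[0] == 0:                    # empty segment -> zero feature
            return torch.zeros(self.feat_dim)
        return self.mlp(pts).max(dim=0).values   # (feat_dim,)

class MultiScaleSlidingWindow(nn.Module):
    """Average segment features over several window sizes and fuse the scales
    with learned softmax weights (a stand-in for adaptive learning fusion)."""
    def __init__(self, window_sizes=WINDOW_SIZES):
        super().__init__()
        self.window_sizes = window_sizes
        self.scale_logits = nn.Parameter(torch.zeros(len(window_sizes)))

    def forward(self, seg_feats):
        x = seg_feats.t().unsqueeze(0)           # (1, feat_dim, num_segments)
        pooled = [F.avg_pool1d(x, kernel_size=w, stride=1, padding=w // 2)
                  for w in self.window_sizes]    # each: (1, feat_dim, num_segments)
        weights = torch.softmax(self.scale_logits, dim=0)
        fused = sum(wt * p for wt, p in zip(weights, pooled))
        return fused.squeeze(0).t()              # (num_segments, feat_dim)

# Usage on synthetic frustum points:
points = torch.rand(2048, 3) * torch.tensor([10.0, 3.0, 40.0])
encoder = SegmentEncoder()
seg_feats = torch.stack([encoder(s) for s in segment_frustum(points)])
fused = MultiScaleSlidingWindow()(seg_feats)     # (NUM_SEGMENTS, FEAT_DIM)

Because all window sizes are odd and padding is set to half the window, every scale yields one feature per segment, so the scales can be combined position-wise; the actual paper may fuse image features and point features differently downstream.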
