Abstract

Many current object pose estimation methods rely on images or point clouds alone, which prevents them from accurately estimating object pose under occlusion and poor illumination. Moreover, these models have large numbers of parameters and cannot be deployed on mobile devices. We therefore propose a lightweight two-terminal feature fusion network that effectively exploits both images and point clouds for accurate object pose estimation. First, a PointNet network extracts point cloud features. These point cloud features are then fused with the image at the pixel level, and a CNN extracts image features. Next, the extracted image features are fused with the point cloud point by point, and an improved PointNet++ network performs feature extraction on the result. Finally, a set of center point features is obtained; a pose is estimated from each feature, and the pose with the highest confidence is selected as the final result. In addition, we apply depthwise separable convolutions to reduce the number of model parameters. Experiments show that the proposed method achieves better performance on the LineMOD and Occlusion LineMOD datasets, has few parameters, and remains robust under occlusion and low light.
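To make the pipeline concrete, below is a minimal PyTorch sketch of two ideas the abstract highlights: per-point fusion of image and point cloud features with confidence-based hypothesis selection, and depthwise separable convolutions for parameter reduction. All module names, feature dimensions, and the exact fusion scheme are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: module names, feature sizes, and the fusion
# scheme are assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a 1x1 pointwise convolution,
    which uses far fewer parameters than a standard Conv2d."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class FusionPoseHead(nn.Module):
    """Fuses per-point geometric features with image features sampled at
    the corresponding pixels, then predicts one pose hypothesis
    (quaternion + translation) and a confidence for every point."""
    def __init__(self, img_dim=64, pt_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(img_dim + pt_dim, hidden, 1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, 1), nn.ReLU(),
        )
        self.pose = nn.Conv1d(hidden, 7, 1)   # 4 quaternion + 3 translation
        self.conf = nn.Conv1d(hidden, 1, 1)   # per-point confidence

    def forward(self, img_feat, pt_feat):
        x = self.mlp(torch.cat([img_feat, pt_feat], dim=1))
        return self.pose(x), torch.sigmoid(self.conf(x))


# Usage with random stand-in features for N points of one object.
B, N = 1, 500
img_feat = torch.randn(B, 64, N)  # CNN features gathered at the N pixels
pt_feat = torch.randn(B, 64, N)   # PointNet-style features for the N points
head = FusionPoseHead()
poses, conf = head(img_feat, pt_feat)          # (B, 7, N), (B, 1, N)
best = conf.squeeze(1).argmax(dim=1)           # most confident point per item
final_pose = poses[torch.arange(B), :, best]   # (B, 7) selected hypothesis

# Parameter savings from the depthwise separable design (64 -> 64, 3x3):
std = nn.Conv2d(64, 64, 3, padding=1)
dws = DepthwiseSeparableConv(64, 64)
print(sum(p.numel() for p in std.parameters()))  # 36928
print(sum(p.numel() for p in dws.parameters()))  # 4800
```

Selecting the hypothesis with the highest confidence is what gives this style of pipeline its robustness to occlusion: even when many pixels are hidden, the remaining visible points can still produce a high-confidence pose estimate.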
