3D Object Pose Estimation Research Articles

6D pose estimation from RGB-D images plays a pivotal role in robotics applications. At present, most methods extract RGB and depth features and then directly concatenate them, without considering the interactions between the two modalities. This leads to low pose estimation accuracy under occlusion and illumination changes. To solve this problem, we propose a new method for fusing RGB and depth features. Our method uses the information contained within each modality of the RGB-D image and integrates the interactive information across modalities. Specifically, we transform depth images into point clouds and apply the PointNet++ network to extract point cloud features; RGB image features are extracted by CNNs, and attention mechanisms are added to obtain context information within each single modality. We then propose a cross-modality feature fusion module (CFFM) to obtain the cross-modality information, and introduce a feature contribution weight training module (CWTM) to allocate the different contributions of the two modalities to the target task. Finally, the 6D object pose is estimated from the final cross-modality fusion feature. By enabling information interactions within and between modalities, the two modalities are integrated more fully, and accounting for the contribution of each modality enhances the overall robustness of the model. Our experiments indicate that our method reaches an average accuracy of 96.9% on the LineMOD dataset using the ADD(-S) metric, while on the YCB-Video dataset it reaches 94.7% using the ADD-S AUC metric and 96.5% using the ADD-S (<2 cm) metric.
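The abstract reports results with the ADD and ADD-S metrics without defining them. As a rough illustration, the sketch below computes both under their commonly used definitions: ADD averages the distance between corresponding model points under the ground-truth and predicted poses, while ADD-S averages the distance from each ground-truth point to its closest predicted point, which tolerates symmetric objects. The function names, the brute-force nearest-neighbour search, and the 10% of object diameter acceptance threshold are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def transform(points, R, t):
    """Apply a rigid transform (3x3 rotation R, translation t) to (N, 3) points."""
    return points @ R.T + t

def add_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding points under the two poses."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return float(np.linalg.norm(gt - pred, axis=1).mean())

def add_s_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its nearest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    # Brute-force (N, N) pairwise distances; adequate for small illustrative models.
    pairwise = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=2)
    return float(pairwise.min(axis=1).mean())

def is_correct(error, diameter, fraction=0.1):
    """Assumed acceptance rule: error below a fraction of the object diameter."""
    return error < fraction * diameter

# Toy model: the 8 corners of a unit cube centred at the origin.
cube = np.array([[x, y, z] for x in (-0.5, 0.5)
                           for y in (-0.5, 0.5)
                           for z in (-0.5, 0.5)])
identity = np.eye(3)
zero = np.zeros(3)
# A 180-degree rotation about z maps the cube onto itself.
rot_z_180 = np.diag([-1.0, -1.0, 1.0])

add_err = add_metric(cube, identity, zero, rot_z_180, zero)
add_s_err = add_s_metric(cube, identity, zero, rot_z_180, zero)
```

For this symmetric toy object the rotated pose is visually indistinguishable from the ground truth, so ADD-S is zero while ADD is large. That difference is the reason ADD-S is typically used for symmetric objects.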


The potential benefits and implementation of autonomous driving have attracted widespread attention from both industry and academia. This study addresses view-invariant object detection and semantic keypoint-based pose estimation from a single RGB image. Estimating the absolute pose of an on-road vehicle from monocular vision alone, without the help of additional sensors, is a complex machine learning task. The main purpose of this work is to identify other vehicles on the road and estimate their angular position from a single image with improved accuracy. The study focuses on a new algorithm that applies a deep convolutional neural network followed by a recurrent neural structure for more accurate 6D pose inference. We present an end-to-end 6D pose hypothesis approach for individual vehicles, based on a deep hybrid architecture consisting of a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). We use ApolloCar3D, a large-scale dataset for 3D car instance understanding. The dataset contains 5,277 real-life street scenes with about 60K car instances; by comparison, ApolloCar3D is twenty times larger than the PASCAL3D+ and KITTI datasets. Ultimately, the idea is to eliminate motionless cars efficiently and predict the next pose given the speed context by passing the output through an LSTM (long short-term memory) network with an additional filter layer, which also allows a comprehensive evaluation. The new filter added to the LSTM isolates stationary or parked vehicles so that the model can focus on on-road vehicles. Since the LSTM has a non-linear, high-dimensional hidden memory state, it can retain the history of past observations and pay more attention to moving road vehicles than to parked or stationary ones. For each new vehicle, the pose estimator can therefore use the LSTM memory to compare the historical pose with the newly filtered data. A successful implementation of this concept is expected to lead to significant improvements in handling real-life traffic situations in computer vision and autonomous driving.
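The abstract does not say how the stationary-vehicle filter is implemented. Purely as an illustration of the idea, the sketch below flags vehicles as moving when their average speed over a short position history exceeds a threshold. It is a hypothetical pre-filter, not the authors' LSTM filter layer: the function name `moving_mask`, the assumption that positions are given in a fixed world frame in metres, and the 0.5 m/s threshold are all invented for this example.

```python
import numpy as np

def moving_mask(tracks, dt, min_speed=0.5):
    """Return a boolean mask marking vehicles whose mean speed is >= min_speed.

    tracks: array of shape (num_vehicles, num_frames, 3) holding positions in
            a fixed world frame, in metres (assumed input format).
    dt:     time between consecutive frames, in seconds.
    """
    tracks = np.asarray(tracks, dtype=float)
    # Per-frame displacement of each vehicle: shape (num_vehicles, num_frames - 1).
    steps = np.linalg.norm(np.diff(tracks, axis=1), axis=2)
    mean_speed = steps.mean(axis=1) / dt
    return mean_speed >= min_speed

# Two toy tracks over 5 frames: a parked car and a car moving 1 m per frame along x.
parked = np.zeros((5, 3))
driving = np.stack([np.arange(5.0), np.zeros(5), np.zeros(5)], axis=1)
mask = moving_mask(np.stack([parked, driving]), dt=0.1)
```

Only the tracks where `mask` is true would then be passed on to the sequence model. With a moving camera, positions would first need to be compensated for ego-motion, which this sketch does not attempt.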


Articles published on 3D Object Pose Estimation

6D Object Pose Estimation Based on Cross-Modality Feature Fusion.

Self-supervised Vision Transformers for 3D pose estimation of novel objects

OHO: A Multi-Modal, Multi-Purpose Dataset for Human-Robot Object Hand-Over.

Learning geometric consistency and discrepancy for category-level 6D object pose estimation from point clouds

OTOT: An online training and offline testing system for 6D Object Pose Estimation

YOLOPose V2: Understanding and improving transformer-based 6D pose estimation

Recent Advances and Perspectives in Deep Learning Techniques for 3D Point Cloud Data Processing

MORE: simultaneous multi-view 3D object recognition and pose estimation

SANet: A novel segmented attention mechanism and multi-level information fusion network for 6D object pose estimation

Imitrob: Imitation Learning Dataset for Training and Evaluating 6D Object Pose Estimators

Self-Supervised Category-Level 6D Object Pose Estimation With Optical Flow Consistency

6D Object Localization in Car-Assembly Industrial Environment

Deep learning for 6D pose estimation of objects — A case study for autonomous driving

Weak6D: Weakly Supervised 6D Pose Estimation With Iterative Annotation Resolver

Object-Aware 3D Scene Reconstruction from Single 2D Images of Indoor Scenes

Geometric-aware dense matching network for 6D pose estimation of objects from RGB-D images

Ambiguity-Aware Multi-Object Pose Optimization for Visually-Assisted Robot Manipulation

Towards Robot-Assisted Data Generation with Minimal User Interaction for Autonomously Training 6D Pose Estimation in Operational Environments

VP-KLNet: efficient 6D object pose estimation with an enhanced vector-field prediction network and a keypoint localization network

Category-Level 6D Object Pose Estimation With Structure Encoder and Reasoning Attention

