6DoF Object Pose Research Articles

Accurate six-degree-of-freedom (6DoF) pose and focal length estimation is important in extended reality (XR) applications because it enables precise object alignment and projection scaling, which improves the user experience. This study focuses on improving 6DoF pose estimation from single RGB images with unknown camera metadata. Estimating the 6DoF pose and focal length from an uncontrolled RGB image obtained from the internet is challenging because such an image often lacks crucial metadata. Existing methods such as FocalPose and FocalPose++ have made progress in this domain but still face challenges due to the projection scale ambiguity between the translation of an object along the z-axis (tz) and the camera's focal length. To overcome this, we propose a two-stage strategy that decouples the projection scale ambiguity in the estimation of z-axis translation and focal length. In the first stage, tz is set arbitrarily, and we predict all the other pose parameters and the focal length relative to the fixed tz. In the second stage, we predict the true value of tz while scaling the focal length based on the tz update. The proposed two-stage method reduces projection scale ambiguity in RGB images and improves pose estimation accuracy. Iterative update rules restricted to the first stage and tailored loss functions in the second stage, including Huber loss, improve the accuracy of both 6DoF pose and focal length estimation. Experimental results on benchmark datasets show significant improvements in median rotation and translation errors, as well as better projection accuracy, compared to existing state-of-the-art methods. In an evaluation across the Pix3D categories (chair, sofa, table, and bed), the proposed two-stage method improves projection accuracy by approximately 7.19%. In addition, incorporating Huber loss reduced translation and focal length errors by 20.27% and 6.65%, respectively, compared with FocalPose++.
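The projection scale ambiguity described in the abstract above can be illustrated with a simple pinhole model. The following is a minimal sketch and not the authors' implementation: `project`, `rescale_focal`, and `huber` are hypothetical helpers, the rotation is fixed to identity, the object is centred on the optical axis, and the proportional focal-length rescaling is only one plausible reading of "scaling the focal length based on the tz update".

```python
import numpy as np

def project(points_obj, f, t):
    """Pinhole projection of object-frame points translated by t.
    Assumes identity rotation and a principal point at the image origin."""
    p_cam = points_obj + np.asarray(t, dtype=float)
    u = f * p_cam[:, 0] / p_cam[:, 2]
    v = f * p_cam[:, 1] / p_cam[:, 2]
    return np.stack([u, v], axis=1)

def rescale_focal(f_rel, tz_fixed, tz_new):
    """Hypothetical rule: keep the ratio f / tz constant when tz changes."""
    return f_rel * tz_new / tz_fixed

def huber(residual, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    r = np.abs(np.asarray(residual, dtype=float))
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

# A flat 0.2 m square facing the camera (no depth extent).
square = np.array([[-0.1, -0.1, 0.0], [0.1, -0.1, 0.0],
                   [0.1, 0.1, 0.0], [-0.1, 0.1, 0.0]])

# Stage-1 style guess: tz fixed at 1.0 with a focal length relative to it.
near = project(square, f=600.0, t=(0.0, 0.0, 1.0))
# Doubling tz and rescaling f proportionally gives the same image points.
far = project(square, f=rescale_focal(600.0, 1.0, 2.0), t=(0.0, 0.0, 2.0))
ambiguous = np.allclose(near, far)

# A point with depth extent breaks the tie: perspective now differs.
corner = np.array([[0.1, 0.1, 0.1]])
depth_gap = np.abs(project(corner, 600.0, (0.0, 0.0, 1.0))
                   - project(corner, 1200.0, (0.0, 0.0, 2.0))).max()

print("flat object is ambiguous:", ambiguous)
print("pixel gap for a point with depth:", depth_gap)
print("Huber loss for residuals 0.5 and 3.0:", huber([0.5, 3.0]))
```

For the flat square the two (f, tz) pairs are indistinguishable in the image, while a point with depth extent projects to slightly different pixels. This residual perspective cue is what a second-stage tz prediction could exploit under these assumptions.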

Six-degree-of-freedom (6DoF) object pose estimation is a crucial task for virtual reality and accurate robotic manipulation. Category-level 6DoF pose estimation has recently become popular because it generalizes to a complete category of objects. However, current methods focus on data-driven differential learning, which makes them highly dependent on the quality of real-world labeled data and limits their ability to generalize to unseen objects. To address this problem, we propose multi-hypothesis (MH) consistency learning (MH6D) for category-level 6-D object pose estimation without using real-world training data. MH6D uses a parallel consistency learning structure, which alleviates the uncertainty of single-shot feature extraction and promotes domain self-adaptation to reduce the synthetic-to-real domain gap. Specifically, three randomly sampled pose transformations are first applied in parallel to the input point cloud. An attention-guided category-level 6-D pose estimation network with channel attention (CA) and global feature cross-attention (GFCA) modules is then proposed to estimate the three hypothesized 6-D object poses by effectively extracting and fusing global and local features. Finally, we propose a novel loss function that considers both the process and the final result information, allowing MH6D to perform robust consistency learning. We conduct experiments under two training data settings (only synthetic data, and synthetic plus real-world data) to verify the generalization ability of MH6D. Extensive experiments on benchmark datasets demonstrate that MH6D achieves state-of-the-art (SOTA) performance, outperforming most data-driven methods even without using any real-world data. The code is available at https://github.com/CNJianLiu/MH6D.
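The multi-hypothesis idea in the abstract above (transform the input point cloud three times, estimate a pose for each copy, and require the hypotheses to agree) can be sketched as follows. This is an illustrative toy, not the MH6D code: a Kabsch alignment with known correspondences stands in for the attention-guided network, and `consistency_loss` is a simple pairwise disagreement term, not the loss proposed in the paper. All function names are hypothetical.

```python
import numpy as np

def random_rotation(rng):
    """Sample a random 3x3 rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs of the factorization
    if np.linalg.det(q) < 0:         # ensure a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def estimate_pose(model, cloud):
    """Stand-in for the pose network: Kabsch fit so that cloud ~ R @ model + t.
    Assumes point-to-point correspondences are known (toy setting only)."""
    m_c, c_c = model.mean(axis=0), cloud.mean(axis=0)
    H = (model - m_c).T @ (cloud - c_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, c_c - R @ m_c

def consistency_loss(poses):
    """Sum of pairwise rotation and translation disagreements."""
    loss = 0.0
    for i in range(len(poses)):
        for j in range(i + 1, len(poses)):
            (Ri, ti), (Rj, tj) = poses[i], poses[j]
            loss += np.linalg.norm(Ri - Rj) + np.linalg.norm(ti - tj)
    return loss

rng = np.random.default_rng(0)
model = rng.normal(size=(50, 3))                 # canonical object points
R_true, t_true = random_rotation(rng), np.array([0.1, -0.2, 0.8])
cloud = model @ R_true.T + t_true                # observed point cloud

hypotheses = []
for _ in range(3):                               # three parallel hypotheses
    R_aug, t_aug = random_rotation(rng), rng.normal(scale=0.05, size=3)
    cloud_aug = cloud @ R_aug.T + t_aug          # random pose transformation
    R_hat, t_hat = estimate_pose(model, cloud_aug)
    # Undo the known transformation so all hypotheses share one frame.
    hypotheses.append((R_aug.T @ R_hat, R_aug.T @ (t_hat - t_aug)))

loss = consistency_loss(hypotheses)
print("consistency loss:", loss)
```

With a perfect estimator and noise-free data the three back-transformed hypotheses coincide, so the loss is numerically zero. With a learned estimator, a nonzero value would signal disagreement that training could penalize.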

Articles published on 6DoF Object Pose

6DoF Object Pose and Focal Length Estimation from Single RGB Images in Uncontrolled Environments.

RNNPose: 6-DoF Object Pose Estimation via Recurrent Correspondence Field Estimation and Pose Optimization.

A Survey of 6DoF Object Pose Estimation Methods for Different Application Scenarios.

MH6D: Multi-Hypothesis Consistency Learning for Category-Level 6-D Object Pose Estimation.

Line-Based 6-DoF Object Pose Estimation and Tracking With an Event Camera.

Deep Learning-Based 6-DoF Object Pose Estimation Considering Synthetic Dataset.

Enhancing 6-DoF Object Pose Estimation through Multiple Modality Fusion: A Hybrid CNN Architecture with Cross-Layer and Cross-Modal Integration

Fine segmentation and difference-aware shape adjustment for category-level 6DoF object pose estimation

Bin Picking for Ship-Building Logistics Using Perception and Grasping Systems

KVNet: An iterative 3D keypoints voting network for real-time 6-DoF object pose estimation

6D Object Pose Tracking with Optical Flow Network for Robotic Manipulation

Spatial feature mapping for 6DoF object pose estimation

DeepNet-Based 3D Visual Servoing Robotic Manipulation

Semantic keypoint-based pose estimation from single RGB frames

DRNet: A Depth-Based Regression Network for 6D Object Pose Estimation.

Triangulate geometric constraint combined with visual-flow fusion network for accurate 6DoF pose estimation

End-to-End 6DoF Pose Estimation From Monocular RGB Images

SynPo-Net: Accurate and Fast CNN-Based 6DoF Object Pose Estimation Using Synthetic Training.

PVNet: Pixel-Wise Voting Network for 6DoF Object Pose Estimation.

6DoF Pose Estimation of Transparent Object from a Single RGB-D Image.
