Abstract
Human life is populated with articulated objects. Current Category-level Articulation Pose Estimation (CAPE) methods are studied under the single-instance setting with a fixed kinematic structure for each category. Considering these limitations, we aim to study the problem of estimating part-level 6D pose for multiple articulated objects with unknown kinematic structures in a single RGB-D image, and reform this problem setting for real-world environments and suggest a CAPE-Real (CAPER) task setting. This setting allows varied kinematic structures within a semantic category, and multiple instances to co-exist in an observation of real world. To support this task, we build an articulated model repository ReArt-48 and present an efficient dataset generation pipeline, which contains Fast Articulated Object Modeling (FAOM) and Semi-Authentic MixEd Reality Technique (SAMERT). Accompanying the pipeline, we build a large-scale mixed reality dataset ReArtMix and a real world dataset ReArtVal. Accompanying the CAPER problem and the dataset, we propose an effective framework that exploits RGB-D input to estimate part-level pose for multiple instances in a single forward pass. In our method, we introduce object detection from RGB-D input to handle the multi-instance problem and segment each instance into several parts. To address the unknown kinematic structure issue, we propose an Articulation Parsing Network to analyze the structure of detected instance, and also build a Pair Articulation Pose Estimation module to estimate per-part 6D pose as well as joint property from connected part pairs. Extensive experiments demonstrate that the proposed method can achieve good performance on CAPER, CAPE and instance-level Robot Arm pose estimation problems. We believe it could serve as a strong baseline for future research on the CAPER task. The datasets and codes in our work will be made publicly available.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.