3D Keypoints Research Articles

This paper proposes a manipulation planning method for object re-orientation based on semantic segmentation keypoint detection, which enables a robot manipulator to detect randomly placed objects and re-orient them to a specified position and pose. The method has two main parts: (1) a 3D keypoint detection system; and (2) a manipulation planning system for object re-orientation. In the 3D keypoint detection system, an RGB-D camera captures information about the environment, and 3D keypoints of the target object are generated as inputs that represent its position and pose. This simplifies the 3D model representation, so that manipulation planning for object re-orientation can be executed in a category-level manner by adding varied training data of the object in the training phase. In addition, 3D suction points in both the object’s current and expected poses are generated as inputs to the next operation stage. During the next stage, the Mask Region-based Convolutional Neural Network (Mask R-CNN) algorithm is used for preliminary object detection and to obtain object images. The image with the highest confidence score is selected as the input to the semantic segmentation system, which classifies each pixel in the image into the corresponding pack unit of the object. After semantic segmentation by a convolutional neural network, several iterations of the Conditional Random Fields (CRFs) method are applied to obtain a more accurate object recognition result. Once the target object has been segmented into pack units, the center position of each pack unit can be obtained. Then, a normal vector at each pack unit’s center point is generated from the depth image information, and the pose of the object is obtained by connecting the center points of the pack units.
In the manipulation planning system for object re-orientation, the pose of the object and the normal vector of each pack unit are first converted into the working coordinate system of the robot manipulator. Then, according to the current and expected poses of the object, the spherical linear interpolation (Slerp) algorithm is used to generate a series of movements in the workspace that the robot manipulator executes to re-orient the object. In addition, the pose of the object is adjusted on the z-axis of the object’s geodetic coordinate system based on the image features on the surface of the object, so that the pose of the placed object approaches the desired pose. Finally, a robot manipulator and a laboratory-made vacuum suction cup are used to verify that the proposed system can complete the planned object re-orientation task.
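As a rough illustration of the Slerp step described above, the sketch below interpolates between a current and an expected orientation and returns a list of intermediate orientations. It is not the authors' implementation: the function names, the (w, x, y, z) unit-quaternion convention, the 0.9995 near-parallel threshold, and the fixed number of waypoints are all assumptions made for this example.

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip one to follow the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical orientations: linear interpolation avoids dividing
        # by a very small sin(theta). The threshold is an assumed value.
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        q = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def orientation_waypoints(q_current, q_expected, steps):
    """Hypothetical helper: `steps` orientations (steps >= 2), endpoints included."""
    return [slerp(q_current, q_expected, i / (steps - 1)) for i in range(steps)]

# Example: rotate from the identity to a 90-degree turn about the z-axis.
q_start = (1.0, 0.0, 0.0, 0.0)
q_goal = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
waypoints = orientation_waypoints(q_start, q_goal, 5)
```

Each waypoint here is an orientation only. Turning the sequence into manipulator motions would also need position interpolation and inverse kinematics, which the abstract does not detail.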

6D pose estimation is an important branch of vision measurement and is widely used in robotics, autonomous driving and augmented reality. The latest research trend in 6D pose estimation is to train a deep neural network to directly predict the 2D projections of 3D keypoints from the image, establish the 2D-3D correspondences, and finally use the perspective-n-point (PnP) algorithm to estimate the pose. The current challenges are that detection accuracy drops when objects are textureless or occluded or when the scene is cluttered, and that most existing models are large and cannot meet real-time requirements. In this paper, we introduce a densely connected feature pyramid network (DFPN) that can efficiently integrate and utilize features. We combine the cross-stage partial network (CSPNet) with DFPN to design DFPN-6D, a new network for 6D object pose estimation. DFPN-6D can efficiently and accurately handle textureless objects, occlusion and scene clutter, and estimates full 6D poses in a single shot. Furthermore, we propose a new confidence calculation method and loss function for object pose estimation that fully consider spatial information. Finally, we propose a novel augmentation method, called 6D augmentation, that improves the performance and generalization ability of direct 6D pose estimation approaches under occlusion. Our approach achieves new state-of-the-art accuracies of 98.06 and 87.09 in terms of the ADD(-S) metric on the Linemod and Occluded-Linemod datasets, respectively, and also achieves the best results under the respective metrics on the MULT-I, BIN-P and T-LESS datasets, while still running end-to-end at over 65 FPS.
The experimental results demonstrate that our algorithm is robust to textureless materials and occlusion while running more efficiently than other methods. We also deploy our proposed method to a real robot to grasp and manipulate objects based on the estimated pose.
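For readers unfamiliar with the ADD(-S) metric quoted above, the following is a minimal sketch of how the two underlying distances are commonly computed. ADD averages the distance between corresponding model points under the ground-truth and estimated poses; ADD-S, used for symmetric objects, averages the distance to the closest transformed point instead. The function names and the plain-list representation of rotations and translations are assumptions made for this example, not code from the paper.

```python
import math

def transform(points, R, t):
    """Apply a rigid transform: R is a 3x3 nested list, t a length-3 sequence."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def add_distance(points, R_gt, t_gt, R_est, t_est):
    """ADD: mean distance between corresponding model points under both poses."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    return sum(math.dist(a, b) for a, b in zip(gt, est)) / len(points)

def add_s_distance(points, R_gt, t_gt, R_est, t_est):
    """ADD-S: mean distance from each ground-truth point to its nearest estimated point."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    return sum(min(math.dist(a, b) for b in est) for a in gt) / len(points)

# Toy model with two points that are symmetric under a 180-degree turn about z.
model = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
half_turn_z = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
zero = (0.0, 0.0, 0.0)

add_value = add_distance(model, identity, zero, half_turn_z, zero)
add_s_value = add_s_distance(model, identity, zero, half_turn_z, zero)
```

In the toy case the estimated pose swaps the two points, so ADD penalizes it heavily while ADD-S treats it as a perfect match. Benchmarks commonly count a pose as correct when the distance falls below a fraction of the object diameter (often 10%); whether the figures above use exactly that threshold is not stated in the abstract.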

Articles published on 3D Keypoints

VGF-Net: Visual-Geometric fusion learning for simultaneous drone navigation and height mapping

ASPset: An outdoor sports pose video dataset with 3D keypoint annotations

ASPP-DF-PVNet: Atrous Spatial Pyramid Pooling and Distance-Filtered PVNet for occlusion resistant 6D object pose estimation

Pointless Pose: Part Affinity Field-Based 3D Pose Estimation without Detecting Keypoints

Manipulation Planning for Object Re-Orientation Based on Semantic Segmentation Keypoint Detection.

Head Pose Estimation through Keypoints Matching between Reconstructed 3D Face Model and 2D Image.

Triangulate geometric constraint combined with visual-flow fusion network for accurate 6DoF pose estimation

Anatomy-Aware 3D Human Pose Estimation With Bone-Based Pose Decomposition

Detecting Object Surface Keypoints From a Single RGB Image via Deep Learning Network for 6-DoF Pose Estimation

RGB-D Videos-Based Early Prediction of Infant Cerebral Palsy via General Movements Complexity

A Practical O(N²) Outlier Removal Method for Correspondence-Based Point Cloud Registration.

Real-Time and Efficient 6-D Pose Estimation From a Single RGB Image

Understanding Pixel-Level 2D Image Semantics With 3D Keypoint Knowledge Engine.

Attitude Mounting Misalignment Estimation Method for the Calibration of UAV LiDAR System by using a TIN-based Corresponding Model

RGB2Hands

New multi‐view human motion capture framework

Learning Semantic Keypoint Representations for Door Opening Manipulation

A multi-task Faster R-CNN method for 3D vehicle detection based on a single image

Human 3D pose estimation in a lying position by RGB-D images for medical diagnosis and rehabilitation.
