Abstract

In-field object detection and pose estimation are challenging tasks in industrial harvesting scenarios. This study investigates a method for object detection and 6D pose estimation using a depth camera to help harvesting robots avoid collisions while harvesting grape clusters. First, the Mask Region-based Convolutional Neural Network (Mask R-CNN) is deployed to segment 2D images and output binary mask images of grape clusters. Second, the grape cluster point cloud is segmented based on the binary mask image and the mapping relationship between the image and the point cloud, and pre-processing is applied to provide a high-quality point cloud. Third, the optimal cutting point is located by constructing a region of interest (RoI) around the grape peduncle. Finally, the peduncle surface is fitted with the locally weighted scatterplot smoothing (LOWESS) algorithm, and the pose of the peduncle is estimated using geometric methods. The detection results from 238 test images show a mean precision (mP) of 86.0%, a mean recall (mR) of 79.9%, an F1-score of 0.828, and a mean intersection over union (mIoU) of 87.9% for instance segmentation. The pose estimation results from 172 grape peduncles yield an angular error of 22.22° ± 17.96°. Detection and pose estimation for each grape cluster take approximately 1.786 s. The demonstrated performance indicates that the proposed method can be applied to collision-free robotic grape harvesting in unstructured environments.
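The pipeline above maps a 2D instance mask into a 3D point cloud and then recovers the peduncle direction from a LOWESS-smoothed curve. The following is a minimal sketch of those two steps, assuming a pinhole depth-camera model and approximating the peduncle axis by the line between the endpoints of the fitted curve; the function names, intrinsics, and the endpoint-based geometric reduction are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess


def mask_to_point_cloud(depth_m, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels into camera coordinates using the
    pinhole model (the image-to-point-cloud mapping). depth_m is in meters."""
    v, u = np.nonzero(mask)              # pixel rows/cols inside the grape mask
    z = depth_m[v, u]
    valid = z > 0                        # drop pixels with no depth reading
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.column_stack([x, y, z])


def peduncle_axis_from_lowess(points, frac=0.5):
    """Fit a smooth curve through peduncle points with LOWESS and return a
    unit direction vector between its endpoints, from which the peduncle's
    orientation angles can be derived."""
    order = np.argsort(points[:, 1])     # sort along the (roughly vertical) y axis
    y = points[order, 1]
    x_fit = lowess(points[order, 0], y, frac=frac, return_sorted=False)
    z_fit = lowess(points[order, 2], y, frac=frac, return_sorted=False)
    p0 = np.array([x_fit[0], y[0], z_fit[0]])
    p1 = np.array([x_fit[-1], y[-1], z_fit[-1]])
    axis = p1 - p0
    return axis / np.linalg.norm(axis)


# Example (hypothetical intrinsics): peduncle_points would come from the
# peduncle RoI cropped out of the segmented grape-cluster cloud.
# cloud = mask_to_point_cloud(depth, mask, fx=615.0, fy=615.0, cx=320.0, cy=240.0)
# axis = peduncle_axis_from_lowess(peduncle_points)
```

The angular error reported in the abstract would then correspond to the angle between such an estimated axis and the ground-truth peduncle direction.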
