This paper presents an experimental comparison of two existing methods, each representative of a category of 6D pose estimation algorithms commonly used in the robotics community. The first category comprises purely deep learning methods; the second comprises hybrid approaches that combine learning pipelines with geometric reasoning. The hybrid method considered here is a pipeline consisting of an instance-level deep neural network that uses RGB data only, followed by a geometric pose refinement algorithm that relies on the depth map and the CAD model of the target object. This method can handle objects whose dimensions differ from those of the CAD model. The pure learning method considered is DenseFusion, a well-established state-of-the-art pose estimation algorithm, selected because it uses the same input data, namely an RGB image and a depth map. The comparison is carried out by measuring the success rate of pick-and-place operations on fresh food. The fruit-picking scenario was selected because it is challenging: object instances vary widely in appearance and dimensions. Experiments with apples and limes show that the hybrid method is more accurate than the pure learning one, which allows fruits to be picked and placed with a higher success rate. An extensive discussion is also presented to help the robotics community select the category of 6D pose estimation algorithms best suited to a specific application.