This study presents the Fast Fruit 3D Detector (FF3D), a novel framework that couples a 3D neural network for fruit detection with an anisotropic Gaussian-based next-best view estimator. At its core, FF3D employs a one-stage, end-to-end 3D object detection network built on a 3D convolutional neural network (3D CNN), which offers superior accuracy and robustness compared with traditional 2D methods, followed by the anisotropic Gaussian-based next-best view estimation module. The architecture unifies point cloud feature extraction and object detection, achieving accurate real-time fruit localization. The model is trained on a large-scale 3D fruit dataset containing data collected from an apple orchard. In addition, the proposed next-best view estimator improves accuracy and lowers the collision risk during grasping. Thorough evaluations on the test set and in a simulated environment validate the efficacy of FF3D. The experimental results show an AP of 76.3%, an AR of 92.3%, and an average Euclidean distance error of less than 6.2 mm, highlighting the framework's potential to overcome the challenges of orchard environments.
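To make the next-best view idea concrete, the sketch below shows one simple way an anisotropic Gaussian could be fit to a detected fruit's point cloud and used to rank candidate camera positions. The function names, the minor-axis alignment heuristic, and all numeric values are illustrative assumptions for exposition only; they are not the scoring criterion used by FF3D.

```python
import numpy as np


def fit_anisotropic_gaussian(points: np.ndarray):
    """Fit an anisotropic 3D Gaussian (mean, covariance) to one fruit's point cloud.

    `points` is an (N, 3) array of 3D points assigned to a single detected fruit.
    """
    mean = points.mean(axis=0)
    cov = np.cov(points - mean, rowvar=False)  # 3x3 anisotropic covariance
    return mean, cov


def rank_candidate_views(mean: np.ndarray, cov: np.ndarray,
                         candidate_positions: np.ndarray) -> np.ndarray:
    """Rank candidate camera positions around a fruit (best first).

    Hypothetical heuristic: a viewing direction aligned with the Gaussian's
    smallest principal axis faces the fruit's largest cross-section, so each
    candidate is scored by how well its direction toward the fruit centre
    aligns with that minor axis.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    minor_axis = eigvecs[:, 0]                   # direction of least spread
    directions = mean - candidate_positions
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    alignment = np.abs(directions @ minor_axis)  # |cos| with the minor axis
    return np.argsort(-alignment)


# Example: a synthetic elongated fruit cluster and a ring of candidate viewpoints.
rng = np.random.default_rng(0)
fruit_points = rng.normal(0.0, [0.03, 0.03, 0.05], size=(500, 3))  # metres
mu, sigma = fit_anisotropic_gaussian(fruit_points)
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
candidates = np.stack([0.5 * np.cos(angles), 0.5 * np.sin(angles),
                       np.zeros_like(angles)], axis=1)
print("candidate view ranking (best first):", rank_candidate_views(mu, sigma, candidates))
```

In this toy setup the covariance eigendecomposition captures the fruit's anisotropy, and the ranking simply prefers views facing the broadest side; a full next-best view estimator such as the one described in the paper would additionally account for occlusion and collision risk.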