Semantic segmentation of fruits on multi-sensor fused data in natural orchards

Hanwen Kang, Xing Wang

doi:10.1016/j.compag.2022.107569

Abstract

Semantic segmentation is a fundamental vision task that allows agricultural robots to understand their surroundings in natural orchards. Recent developments in LiDAR techniques enable a robot to acquire accurate range measurements of the scene, which carry richer geometric information than RGB images. By combining point clouds and color, rich features describing both geometry and texture can be obtained. In this work, we propose a deep-learning-based method to perform accurate semantic segmentation on fused data from a LiDAR-Camera visual sensor. Two critical problems are explored and solved. The first is how to efficiently fuse the texture and geometric features from multi-sensor data. The second is how to efficiently train the 3D segmentation network under severely imbalanced class conditions. Moreover, an implementation of 3D segmentation in orchards, including LiDAR-Camera data fusion, data collection and labeling, network training, and model inference, is introduced in detail. In the experiments, we comprehensively analyze the network setup when dealing with highly unstructured and noisy point clouds acquired from an apple orchard. Overall, our proposed method achieves 86.2% mIoU on the segmentation of fruits on high-resolution point clouds (100k–200k points). The experimental results show that the proposed method can perform accurate segmentation in real orchard environments.
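
For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from per-point labels. It assumes the standard definition, in which IoU is computed per class and averaged over the classes that appear in the prediction or the ground truth; the abstract does not state the exact averaging used, and the function name `mean_iou` and the two-class toy labels are illustrative only.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    pred, target: equal-length sequences of integer class labels, one per point.
    Assumption: classes absent from both sequences are skipped, not counted as 0.
    """
    ious = []
    for c in range(num_classes):
        # Points where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with hypothetical labels: 0 = background, 1 = fruit.
# 98 background points and 2 fruit points; the model predicts all background.
target = [0] * 98 + [1] * 2
pred = [0] * 100
accuracy = sum(p == t for p, t in zip(pred, target)) / len(target)
miou = mean_iou(pred, target, num_classes=2)
```

In this toy case the point-wise accuracy is 0.98 while the mIoU is 0.49, because the fruit class contributes an IoU of zero. This is one reason a class-averaged metric is informative when classes are severely imbalanced, as they are for fruit points in an orchard scene.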

Full Text