Abstract

The vision system of a tomato-picking robot faces two difficult tasks: acquiring the precise pose of each tomato and locating its stem. Together, tomato pose and stem location help determine the end-effector pose and make collision-free picking possible. For efficient picking, target location, pose detection, and obstacle semantic segmentation should be completed in a single model so that comprehensive visual information is obtained in one pass. We therefore propose YOLO-MCNN, a multitask convolutional neural network that performs all three tasks in one model. Four strategies are proposed to enhance segmentation ability, based on fusing multi-scale features and determining the optimal location of the semantic segmentation branch. The experimental results show that fusing the semantic segmentation branch with the second layer of shallow feature maps and placing the branch after the 17th layer gives the best segmentation performance. Fusing shallow feature maps improves small-target detection, while merging multi-scale feature maps enhances semantic segmentation performance. Ablation experiments are also conducted to compare the multitask network with single-task networks; the results indicate that running multiple tasks on the same backbone network does not degrade their performance. YOLO-MCNN achieves a target detection F1 of 87.8%, a semantic segmentation mIoU of 74.8%, and a keypoint detection error dlmk of 6.95 pixels, with a network size of 15.2 MB and an inference time of 19.9 ms. Compared with other target detection and semantic segmentation networks, YOLO-MCNN shows the best overall performance. The method provides a theoretical foundation for constructing multitask convolutional neural networks.
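
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how F1, mIoU, and a keypoint distance error might be computed. It is not the authors' evaluation code. The function names are hypothetical, and the abstract does not define dlmk; the sketch assumes it is the mean Euclidean distance in pixels between predicted and ground-truth keypoints.

```python
import math


def f1_score(tp, fp, fn):
    """F1 from detection counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for 2D label maps.

    pred and target are equally sized lists of rows of integer class ids.
    Classes absent from both maps are skipped rather than counted as 0.
    """
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                inter += int(p == c and t == c)
                union += int(p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def mean_keypoint_distance(pred_pts, true_pts):
    """Assumed reading of dlmk: mean Euclidean distance in pixels."""
    dists = [math.dist(p, t) for p, t in zip(pred_pts, true_pts)]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the paper.
    print(f1_score(tp=8, fp=2, fn=2))
    print(mean_iou([[0, 1], [1, 1]], [[0, 1], [0, 1]], num_classes=2))
    print(mean_keypoint_distance([(0, 0), (3, 4)], [(0, 0), (0, 0)]))
```

Under these definitions, higher F1 and mIoU are better, while a lower keypoint distance is better, which is why the three figures in the abstract cannot be compared on a single scale.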
