Highlights

- Deep learning-based instance segmentation models were applied and evaluated for tomato fruit detection.
- Mask R-CNN with a vision transformer backbone showed the highest accuracy for tomato instance detection.
- Size and weight estimation indexes were calculated from tomato region depth data produced by the instance segmentation models.
- The area-based index estimates fruit weight more accurately than indexes based on width and height information.

Abstract

The size and weight of fruits are crucial factors in yield prediction and in determining harvesting time. Machine vision, including fruit detection, is a key technology for automated monitoring and harvesting of fruits, and deep learning-based fruit-detection methods in particular have been actively applied. Estimating fruit size after detection requires depth information, which can be acquired through depth imaging; RGB-D cameras provide both the color and depth information needed for fruit detection and size estimation. In this study, RGB-D imaging was used to estimate the size and weight of tomatoes, and deep learning-based instance segmentation models, including Mask R-CNN, YOLACT, and RTMDet, were trained and evaluated for tomato fruit detection. The proposed method estimated fruit width with a root mean square error (RMSE) of 4 mm and a mean absolute percentage error (MAPE) of 4.28%, and fruit height with an RMSE of 5.12 mm and a MAPE of 6.42%. Furthermore, the weight-prediction model based on the area index estimated tomato fruit weight with an RMSE of 19.69 g and a MAPE of 9.44%. The method can therefore provide accurate size and weight estimation and can be applied to growth monitoring and automated tomato harvesting.

Keywords: Deep learning, Fruit sizing, Instance segmentation, RGB-D, Tomato.
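The abstract does not spell out how physical size is recovered from the depth data or how weight is regressed from the area index. As a rough illustration only, the Python sketch below shows one common way such steps are implemented: a pinhole camera model to convert mask dimensions to millimetres, and a simple regression on the area index for weight. The function names, camera intrinsics, and regression coefficients here are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fruit_size_mm(mask, depth_mm, fx, fy):
    """Estimate fruit width/height (mm) and an area index (mm^2) from a binary
    instance mask and an aligned depth image (mm), using a pinhole camera model.
    fx, fy are focal lengths in pixels, assumed known from calibration."""
    ys, xs = np.nonzero(mask)
    # Median depth over the fruit region is more robust than the mean
    # to depth noise and occlusion at the fruit boundary.
    z = np.median(depth_mm[ys, xs])
    width_px = xs.max() - xs.min() + 1
    height_px = ys.max() - ys.min() + 1
    # Pinhole model: physical size = pixel extent * depth / focal length.
    width_mm = width_px * z / fx
    height_mm = height_px * z / fy
    # Area index: mask area in pixels scaled by each pixel's footprint at depth z.
    area_mm2 = mask.sum() * (z / fx) * (z / fy)
    return width_mm, height_mm, area_mm2

def weight_from_area(area_mm2, a=0.05, b=5.0):
    """Hypothetical linear weight model on the area index; the coefficients
    a and b are placeholders, not values from the paper."""
    return a * area_mm2 + b

def rmse_mape(y_true, y_pred):
    """Evaluation metrics reported in the abstract: RMSE and MAPE (%)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0
    return rmse, mape
```

In practice the regression coefficients would be fitted on measured fruit weights, and the area index could equally be modelled as a power law; the abstract only reports that the area-based index gave the lowest weight error (RMSE 19.69 g, MAPE 9.44%).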
