Abstract

Accurate counting of pears in orchard environments is essential for crop management. However, due to the time and labor cost of manual counting, farmers rely on sampling a few trees and extending the count to the entire orchard, which overlooks the spatial variability of fruit load. This study proposes automatic labeling in case of insufficient manually labeled data to accelerate the adaptation of on-tree fruit counting solutions based on deep learning. Since the amount of training data needed can be one of the main limitations to deploying a robust solution in the field, here we present an approach to minimize human intervention in data labeling. We describe, first, how the latest advances in self-supervised learning and transformers can be used to create an automatic labeling procedure, and second, how the automatically labeled dataset can be exploited despite its noise. For the dataset creation using automatic labeling, we modified a single object discovery algorithm to localize multiple object instances, and we trained an image classifier to include only objects from our target class: pears. The automatically generated labels are regarded as ‘noisy’ because the quality of the annotations is lower than manually annotated data. We then pre-trained an object detection model with the automatically labeled data (121,038 annotations) and fine-tuned it with insufficient manually labeled data (500, 1000, and 2500 annotations). This allowed us to leverage knowledge from both datasets despite the quality and size restrictions. The performance comparison of object detection models trained this way to those trained only with insufficient manually labeled data showed average precision gains of 37.0, 21.4, and 1.0%. In addition, on unseen images taken with the same camera setup, the average precision gains were 35.9, 17.6, and 1.8%. Furthermore, gains were 38.5, 25.1, and 1.2% on unseen images taken with a different camera setup. These results suggest the substantial value of automatic labeling when having insufficient training data for fruit detection in pear orchards. Moreover, other object detection applications can replicate the presented methods and experiments, especially when resource constraints or rapid validation prevent extensive manual labeling.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call