Abstract

An extensive, publicly available dataset is presented: the LAR Broccoli dataset, which contains 20,000 manually annotated images of broccoli heads captured from a moving tractor at an organic farm in the UK. The dataset contains images of the same row of broccoli heads recorded at 30 frames per second (fps) with three different cameras: two off-the-shelf, relatively low-cost depth-sensing cameras, used with the tractor moving at a speed of around 1 km/h, and a webcam, used with the tractor moving twice as fast. The utility of the dataset is demonstrated in four ways. First, three different state-of-the-art detector models were trained on the dataset, with the best-performing detector achieving an overall mean Average Precision (mAP) score of over 95%; these results validate the utility of the dataset for the standard task of in-field broccoli head recognition. Second, transfer-learning experiments were conducted in which a detector initialised from a smaller pre-trained broccoli detection model was refined with the LAR Broccoli dataset. Third, the benefits of transfer learning were assessed not only in terms of mAP but also according to the time and space required for training, which serves as a proxy for energy efficiency, a practical consideration for real-world model training. Fourth, cross-camera generalisation across the three camera systems was evaluated. The results show that training and testing detector models on images from different camera systems can reduce performance unless the training set also includes images captured in the same manner as those in the test set.
