In recent years, many agriculture-related problems have been evaluated with the integration of artificial intelligence techniques and remote sensing systems. Specifically, in fruit detection problems, several recent works were developed using Deep Learning (DL) methods applied in images acquired in different acquisition levels. However, the increasing use of anti-hail plastic net cover in commercial orchards highlights the importance of terrestrial remote sensing systems. Apples are one of the most highly-challenging fruits to be detected in images, mainly because of the target occlusion problem occurrence. Additionally, the introduction of high-density apple tree orchards makes the identification of single fruits a real challenge. To support farmers to detect apple fruits efficiently, this paper presents an approach based on the Adaptive Training Sample Selection (ATSS) deep learning method applied to close-range and low-cost terrestrial RGB images. The correct identification supports apple production forecasting and gives local producers a better idea of forthcoming management practices. The main advantage of the ATSS method is that only the center point of the objects is labeled, which is much more practicable and realistic than bounding-box annotations in heavily dense fruit orchards. Additionally, we evaluated other object detection methods such as RetinaNet, Libra Regions with Convolutional Neural Network (R-CNN), Cascade R-CNN, Faster R-CNN, Feature Selective Anchor-Free (FSAF), and High-Resolution Network (HRNet). The study area is a highly-dense apple orchard consisting of Fuji Suprema apple fruits (Malus domestica Borkh) located in a smallholder farm in the state of Santa Catarina (southern Brazil). A total of 398 terrestrial images were taken nearly perpendicularly in front of the trees by a professional camera, assuring both a good vertical coverage of the apple trees in terms of heights and overlapping between picture frames. After, the high-resolution RGB images were divided into several patches for helping the detection of small and/or occluded apples. A total of 3119, 840, and 2010 patches were used for training, validation, and testing, respectively. Moreover, the proposed method’s generalization capability was assessed by applying simulated image corruptions to the test set images with different severity levels, including noise, blurs, weather, and digital processing. Experiments were also conducted by varying the bounding box size (80, 100, 120, 140, 160, and 180 pixels) in the image original for the proposed approach. Our results showed that the ATSS-based method slightly outperformed all other deep learning methods, between 2.4% and 0.3%. Also, we verified that the best result was obtained with a bounding box size of 160 × 160 pixels. The proposed method was robust regarding most of the corruption, except for snow, frost, and fog weather conditions. Finally, a benchmark of the reported dataset is also generated and publicly available.
Read full abstract