Abstract
This study presents a novel method that uses RGB-D (Red Green Blue-Depth) sensors and fuses aligned RGB and NIR images with deep convolutional neural networks (CNNs) for fruit detection. It aims to build a more accurate, faster, and more reliable fruit detection system, which is a vital element of fruit yield estimation and automated harvesting. Recent work in deep neural networks has led to the development of a state-of-the-art object detector termed Faster Region-based CNN (Faster R-CNN). A common Faster R-CNN network, VGG16, was adopted through transfer learning for the task of kiwifruit detection using imagery obtained from two modalities: RGB (red, green, blue) and near-infrared (NIR) images. A Kinect v2 was used to capture NIR and RGB images of the kiwifruit canopy from below. The NIR (1 channel) and RGB (3 channels) images were aligned and arranged side by side into a 6-channel image. The input layer of the VGG16 was modified to receive the 6-channel image. Two different fusion methods were used to extract features: Image-Fusion (fusion of the RGB and NIR images at the input layer) and Feature-Fusion (fusion of the feature maps of two VGG16 networks, one fed the RGB images and the other the NIR images). The improved networks were trained end-to-end using back-propagation and stochastic gradient descent and compared with the original VGG16 networks using RGB-only and NIR-only input. Results showed that the average precisions (APs) of the original VGG16 with RGB-only and NIR-only input were 88.4% and 89.2%, respectively; the 6-channel VGG16 using the Feature-Fusion method reached 90.5%, while that using the Image-Fusion method reached the highest AP of 90.7% and the fastest detection speed of 0.134 s/image. The results indicate that the proposed kiwifruit detection approach has potential for better fruit detection.
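The two fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every function name in it is hypothetical. Because the abstract describes the NIR image as 1 channel yet the fused image as 6 channels, the sketch assumes the NIR channel is replicated three times before stacking. It likewise assumes that the widened input layer reuses the pre-trained RGB kernels for the NIR channels and that Feature-Fusion concatenates feature maps along the channel axis. None of these details are confirmed by the source.

```python
import numpy as np


def image_fusion(rgb, nir):
    """Stack an aligned RGB image (H, W, 3) and NIR image (H, W) into (H, W, 6)."""
    if rgb.shape[:2] != nir.shape[:2]:
        raise ValueError("RGB and NIR images must be aligned to the same size")
    # Assumption: replicate the single NIR channel three times to reach 6 channels.
    nir3 = np.repeat(nir[..., np.newaxis], 3, axis=2)
    return np.concatenate([rgb, nir3], axis=2)


def widen_first_conv(weights_rgb):
    """Expand first-layer kernels from (out, 3, k, k) to (out, 6, k, k).

    Assumption: the pre-trained RGB kernels are copied for the NIR channels and
    all weights are halved so initial activations keep a similar scale.
    """
    return np.concatenate([weights_rgb, weights_rgb], axis=1) / 2.0


def feature_fusion(feat_rgb, feat_nir):
    """Join feature maps (C, H, W) from two separate backbones along channels."""
    if feat_rgb.shape[1:] != feat_nir.shape[1:]:
        raise ValueError("feature maps must share spatial dimensions")
    return np.concatenate([feat_rgb, feat_nir], axis=0)
```

In this sketch Image-Fusion needs only one backbone pass per image, whereas Feature-Fusion runs two backbones before merging, which is consistent with Image-Fusion being reported as the faster variant.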
Highlights
China is the largest country producing kiwifruits worldwide, with a yield of approximately 2.4 million tons in 2016 from a cultivated area of 197,048 ha [1]
Zhan et al. [26] used an RGB and NIR fusion algorithm with a back-propagation network to grade chestnut quality, improving the discrimination rate by 3.75% and 6.25% compared with using NIR and RGB images alone, respectively
Abdelsalam and Sayed [27] extracted seven color components from RGB and NIR images of citrus and applied a voting process algorithm to detect citrus defects, with an accuracy of more than 95%
The Hayward-Kiwi RGB-NIR-D dataset, with its corresponding annotations, is the first publicly available dataset for kiwifruit detection that contains aligned RGB, depth, and NIR images
Summary
China is the largest country producing kiwifruits worldwide, with a yield of approximately 2.4 million tons in 2016 from a cultivated area of 197,048 ha [1].
Liu et al.: Improved Kiwifruit Detection Using Pre-Trained VGG16 With RGB and NIR Information Fusion
[...] which makes them visible and accessible for manual picking [5]. This canopy structure provides a simpler and more structured workspace for mechanized or automated field operations such as robotic picking [3], [6] than that of other fruit trees such as apples [7].
Bai et al. [29] present a new eddy detection approach that combines the multilayer features of a deep neural network with the characteristics of the eddies to improve eddy detection accuracy, achieving an mAP (mean Average Precision) of 90.6%. These studies achieved higher accuracy using multi-modality information fusion.
Two different fusion methods are studied: fusion of the RGB and NIR images at the input layer of the Faster R-CNN, and fusion of the feature maps of two Faster R-CNN networks, one fed the RGB images and the other the NIR images.
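The AP and mAP percentages reported here summarize a detector's precision-recall curve. The sketch below shows one common way to compute AP for a single class: the area under the precision envelope, with all-point interpolation. The source does not state which AP protocol was used, so this choice is an assumption, as are the function name and its inputs. Matching detections to ground-truth boxes (for example by IoU threshold) is assumed to have been done already and is passed in as `is_true_positive`.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class.

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth object
        (assumed precomputed; the matching rule is not part of this sketch).
    num_ground_truth: total number of ground-truth objects of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

mAP is then the mean of the per-class AP values; for a single-class task such as kiwifruit detection, AP and mAP coincide.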