Three-dimensional (3D) surface reconstruction is crucial for accurately assessing external qualities of navel oranges such as size and shape, and binocular stereo vision is promising for practical sorting scenarios because of its high frame rates and ease of deployment. For objects with texture-less surfaces, such as navel oranges, matching corresponding points with traditional stereo matching algorithms is very challenging because the detected feature points are highly similar in both color and structure. Most existing deep learning-based stereo matching algorithms focus on matching performance over the whole image and pay little attention to the matching performance of the main region. Meanwhile, a simple background with little texture introduces more uncertainty into the matching of corresponding points, which affects the robustness of the model. OrangeStereo was proposed based on the GwcNet structure to match corresponding points on navel oranges more accurately and thereby provide more accurate 3D surface information. The Structural Features Extraction (SFE) module was introduced after the initial feature extraction module to extract stable structural features and avoid the limitations of relying on single convolutional features. The Attention Weights Generation (AWG) module was used at the matching cost calculation stage to help suppress redundant information and make the network pay more attention to the matching performance of specific regions. The improved loss function used semantic information from the left image to assign a greater weight to the loss in the navel orange region, so that the network focused more on the matching quality in that region. An ablation study was conducted to confirm the effectiveness of the SFE module, the AWG module, and the improved loss function.
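The region-weighted loss described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the smooth L1 form, the weight value `fruit_weight=2.0`, the normalization by the weight sum, and all function and parameter names are assumptions made for illustration, since the abstract gives neither the weighting factor nor the exact formula.

```python
import numpy as np


def smooth_l1(diff, beta=1.0):
    """Element-wise smooth L1 penalty (assumed here; the abstract does not name the base loss)."""
    a = np.abs(diff)
    return np.where(a < beta, 0.5 * a ** 2 / beta, a - 0.5 * beta)


def region_weighted_loss(pred, gt, fruit_mask, valid_mask=None, fruit_weight=2.0):
    """Weighted mean disparity loss that up-weights pixels inside the fruit region.

    pred, gt     : predicted and ground-truth disparity maps (same shape)
    fruit_mask   : boolean semantic mask of the fruit in the left image
    valid_mask   : optional boolean mask of pixels with valid ground truth
    fruit_weight : hypothetical weight for fruit pixels (background pixels get 1.0)
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    fruit_mask = np.asarray(fruit_mask, dtype=bool)
    if valid_mask is None:
        valid_mask = np.ones_like(fruit_mask)
    valid_mask = np.asarray(valid_mask, dtype=bool)

    # Per-pixel weights: fruit_weight inside the fruit, 1.0 elsewhere, 0.0 where invalid.
    weights = np.where(fruit_mask, fruit_weight, 1.0) * valid_mask
    total = weights.sum()
    if total == 0:
        return 0.0
    return float((weights * smooth_l1(pred - gt)).sum() / total)
```

With `fruit_weight=1.0` the function reduces to an ordinary mean smooth L1 loss over valid pixels, so the effect of the weighting can be checked by comparing the two settings.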
The ablation study contained five sub-experiments: the baseline model, the baseline with the SFE module added, the baseline with the AWG module added, and the model with both modules with and without the improved loss function. Additionally, OrangeStereo was compared with four typical stereo matching methods: PSMNet, GwcNet, RealtimeStereo, and ACVNet. The proposed OrangeStereo achieved better performance, with EPE, RMSE, bad-1, bad-3, bad-5, and inference time of 0.60 pixels, 3.53 pixels, 2.37%, 1.83%, 1.62%, and 33 ms, respectively, in the orange region. The R² and RMSE for orange depth obtained with our method were 0.982 and 0.81 mm, respectively. The proposed algorithm can be used to obtain accurate depth information of fruits quickly, and it is expected to be applied to 3D fruit reconstruction on commercial fruit sorting lines to achieve more accurate external quality assessment.
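The region-restricted disparity metrics reported above could be computed along the following lines. This is a sketch under common conventions, not the authors' evaluation code: EPE is taken as the mean absolute disparity error, and bad-k as the percentage of pixels whose error exceeds k pixels, both evaluated only inside a region mask. The function name `region_metrics` and its parameters are hypothetical.

```python
import numpy as np


def region_metrics(pred, gt, region_mask, thresholds=(1.0, 3.0, 5.0)):
    """Disparity error metrics restricted to a region of interest.

    pred, gt    : predicted and ground-truth disparity maps (same shape), in pixels
    region_mask : boolean mask selecting the evaluated pixels (e.g. the fruit region)
    thresholds  : error thresholds in pixels for the bad-k percentages
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mask = np.asarray(region_mask, dtype=bool)

    err = np.abs(pred - gt)[mask]  # absolute errors inside the region only
    if err.size == 0:
        raise ValueError("region_mask selects no pixels")

    metrics = {
        "epe": float(err.mean()),                    # end-point error (mean absolute error)
        "rmse": float(np.sqrt((err ** 2).mean())),   # root mean square error
    }
    for t in thresholds:
        # bad-k: percentage of region pixels with error greater than k pixels
        metrics[f"bad-{t:g}"] = float((err > t).mean() * 100.0)
    return metrics
```

Passing a mask that is `True` everywhere gives the whole-image versions of the same metrics, which makes the difference between whole-image and region-level evaluation easy to inspect.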