Maize leaf area offers valuable insights into physiological processes, playing a critical role in breeding and guiding agricultural practices. The Azure Kinect DK possesses the real-time capability to capture and analyze the spatial structural features of crops. However, its further application in maize leaf area measurement is constrained by RGB–depth misalignment and limited sensitivity to detailed organ-level features. This study proposed a novel approach to address and optimize the limitations of the Azure Kinect DK through the multimodal coupling of RGB-D data for enhanced organ-level crop phenotyping. To correct RGB–depth misalignment, a unified recalibration method was developed to ensure accurate alignment between RGB and depth data. Furthermore, a semantic information-guided depth inpainting method was proposed, designed to repair void and flying pixels commonly observed in Azure Kinect DK outputs. The semantic information was extracted using a joint YOLOv11-SAM2 model, which utilizes supervised object recognition prompts and advanced visual large models to achieve precise RGB image semantic parsing with minimal manual input. An efficient pixel filter-based depth inpainting algorithm was then designed to inpaint void and flying pixels and restore consistent, high-confidence depth values within semantic regions. A validation of this approach through leaf area measurements in practical maize field applications—challenged by a limited workspace, constrained viewpoints, and environmental variability—demonstrated near-laboratory precision, achieving an MAPE of 6.549%, RMSE of 4.114 cm², MAE of 2.980 cm², and R² of 0.976 across 60 maize leaf samples. By focusing processing efforts on the image level rather than directly on 3D point clouds, this approach markedly enhanced both efficiency and accuracy with the sufficient utilization of the Azure Kinect DK, making it a promising solution for high-throughput 3D crop phenotyping.
Read full abstract