The shift toward intelligent crop leaf disease detection has driven the use of deep neural networks to build more accurate detection models. In resource-constrained environments, deploying these models in the cloud introduces challenges such as communication latency and privacy concerns. Edge AI devices offer lower communication latency and enhanced scalability. To study the efficient deployment of crop leaf disease detection models on edge AI devices, a dataset of 700 images depicting peanut leaf spot, scorch spot, and rust diseases was collected, and the YOLOX-Tiny network was used to conduct deployment experiments with the peanut leaf disease detection model on the Jetson Nano B01. The experiments first focused on three deployment optimizations: the fusion of rectified linear unit (ReLU) and convolution operations; the integration of Efficient Non-Maximum Suppression for TensorRT (EfficientNMS_TRT) to accelerate post-processing within the TensorRT model; and the conversion of the model format from NCHW (number of samples, channels, height, width) to NHWC (number of samples, height, width, channels) in the TensorFlow Lite model. Further experiments compared the memory usage, power consumption, and inference latency of the two inference frameworks, TensorRT and TensorFlow Lite, and evaluated real-time video detection performance using DeepStream. The results show that fusing ReLU activation functions with convolution operations reduced the inference latency by 55.5% compared to using the Sigmoid linear unit (SiLU) activation alone. In the TensorRT model, integrating the EfficientNMS_TRT module accelerated post-processing, reducing the inference latency by 19.6% and increasing the frames per second (FPS) by 20.4%.
In the TensorFlow Lite model, conversion to the NHWC format decreased the model conversion time by 88.7% and reduced the inference latency by 32.3%. All three optimizations reduced the inference latency and improved inference efficiency. A comparison of the two frameworks showed that TensorFlow Lite used 15% to 20% less memory and 15% to 25% less power than TensorRT, whereas TensorRT achieved inference latencies 53.2% to 55.2% lower than those of TensorFlow Lite. Consequently, TensorRT is better suited to tasks requiring strong real-time performance and low latency, whereas TensorFlow Lite is more appropriate for scenarios with constrained memory and power resources. In addition, combining DeepStream with EfficientNMS_TRT was found to optimize memory and power utilization and to increase the speed of real-time video detection, achieving a detection rate of 28.7 FPS at a resolution of 1280 × 720. These experiments validate the feasibility and advantages of deploying crop leaf disease detection models on edge AI devices.
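
The difference between the NCHW and NHWC layouts named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' conversion pipeline, which converts the model format itself; it only shows how the same tensor is reordered between the two layouts. The function name `nchw_to_nhwc` and the 416 × 416 input size are assumptions chosen for illustration.

```python
import numpy as np


def nchw_to_nhwc(batch: np.ndarray) -> np.ndarray:
    """Reorder a 4-D tensor from (N, C, H, W) to (N, H, W, C).

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    if batch.ndim != 4:
        raise ValueError("expected a 4-D tensor in NCHW order")
    # Move the channel axis (index 1) to the last position, then copy so the
    # result is contiguous in memory in the new layout.
    return np.ascontiguousarray(np.transpose(batch, (0, 2, 3, 1)))


# Assumed example input: one 3-channel image of 416 x 416 pixels.
x = np.zeros((1, 3, 416, 416), dtype=np.float32)
y = nchw_to_nhwc(x)
print(y.shape)  # (1, 416, 416, 3)
```

In NHWC, the channel values of each pixel sit next to each other in memory, whereas NCHW stores each channel as a separate plane.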