Abstract
The latest advances in deep learning make it possible to recognize vegetable diseases from leaf images. Existing disease recognition methods based on computer vision have achieved impressive accuracy, stability, and portability. However, these methods cannot provide a decision-making basis for their final results and offer no textual evidence to support the user’s judgement. Disease diagnosis is a risky decision: if the detection method lacks transparency, users cannot fully trust the recognition results, which greatly limits the application of deep learning-based recognition methods. To address the low human–machine credibility caused by the inability of deep learning-based methods to provide a decision-making basis, this paper proposes a two-stage image dense captioning model named “DFYOLOv5m-M2Transformer”, which generates sentences describing the visible disease features within each recognized diseased area. Firstly, we established a target detection dataset and a dense captioning dataset containing leaf images of 10 diseases of 2 vegetables, cucumber and tomato. Secondly, we chose the DFYOLOv5m network as the disease detector to extract diseased areas from the image, and the M2-Transformer network as the decision-basis generator to produce sentences describing the disease features. Then, the Bi-Level Routing Attention module was introduced to extract fine-grained features under complex backgrounds and thereby resolve the poor feature extraction observed in cases of mixed diseases. Finally, we used atrous convolution to expand the model’s receptive field, and fused NWD with CIoU to improve the model’s performance in detecting small targets. The experimental results show that, under the joint IoU–Meteor evaluation metric, DFYOLOv5m-M2Transformer achieved a mean Average Precision (mAP) of 94.7% on the dense captioning dataset, 7.2% higher than that of Veg-DenseCap, the best-performing model in the control group. Moreover, the decision basis automatically generated by the proposed model is accurate, grammatically correct, and varied in sentence structure. The outcome of this study offers a new approach to improving the user experience of vegetable disease recognition models.
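To make the small-target improvement more concrete, the sketch below illustrates one way a Normalized Wasserstein Distance (NWD) term can be fused with the CIoU loss for bounding-box regression. The abstract does not specify the fusion scheme, so this is a minimal NumPy sketch assuming a weighted-sum fusion; the weighting `ratio`, the NWD constant `c`, and the function names are illustrative assumptions rather than the paper’s implementation.

```python
import numpy as np

def ciou_loss(box_a, box_b, eps=1e-9):
    """Complete-IoU loss between two boxes given as (cx, cy, w, h)."""
    # convert centre/size format to corner coordinates
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2

    # intersection, union, and plain IoU
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter + eps
    iou = inter / union

    # squared centre distance over squared diagonal of the enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = (box_a[0] - box_b[0]) ** 2 + (box_a[1] - box_b[1]) ** 2

    # aspect-ratio consistency term
    v = (4 / np.pi ** 2) * (np.arctan(box_b[2] / (box_b[3] + eps))
                            - np.arctan(box_a[2] / (box_a[3] + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - (iou - rho2 / c2 - alpha * v)

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance: each box (cx, cy, w, h) is modelled as a
    2-D Gaussian N((cx, cy), diag(w^2/4, h^2/4)); c is a dataset-dependent
    constant (the value used here is illustrative)."""
    ga = np.array([box_a[0], box_a[1], box_a[2] / 2, box_a[3] / 2])
    gb = np.array([box_b[0], box_b[1], box_b[2] / 2, box_b[3] / 2])
    w2 = np.sum((ga - gb) ** 2)          # squared 2nd-order Wasserstein distance
    return np.exp(-np.sqrt(w2) / c)      # map distance to a (0, 1] similarity

def fused_box_loss(pred, target, ratio=0.5):
    """Hypothetical weighted fusion of the NWD loss and the CIoU loss."""
    return ratio * (1 - nwd(pred, target)) + (1 - ratio) * ciou_loss(pred, target)

# usage: a small predicted box slightly offset from a small ground-truth box
print(fused_box_loss((50, 50, 12, 10), (52, 51, 11, 10)))
```

The intuition behind such a fusion is that IoU-based losses vanish or become unstable when small predicted and ground-truth boxes barely overlap, whereas the Gaussian-based NWD term still yields a smooth, informative gradient; blending the two keeps CIoU’s behaviour on larger targets while improving sensitivity to small ones.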