With the development of smart agriculture, deep learning is playing an increasingly important role in crop disease recognition. The existing crop disease recognition models are mainly based on convolutional neural networks (CNN). Although traditional CNN models have excellent performance in modeling local relationships, it is difficult to extract global features. This study combines the advantages of CNN in extracting local disease information and vision transformer in obtaining global receptive fields to design a hybrid model called MSCVT. The model incorporates the multiscale self-attention module, which combines multiscale convolution and self-attention mechanisms and enables the fusion of local and global features at both the shallow and deep levels of the model. In addition, the model uses the inverted residual block to replace normal convolution to maintain a low number of parameters. To verify the validity and adaptability of MSCVT in the crop disease dataset, experiments were conducted in the PlantVillage dataset and the Apple Leaf Pathology dataset, and obtained results with recognition accuracies of 99.86% and 97.50%, respectively. In comparison with other CNN models, the proposed model achieved advanced performance in both cases. The experimental results show that MSCVT can obtain high recognition accuracy in crop disease recognition and shows excellent adaptability in multidisease recognition and small-scale disease recognition.
Read full abstract