Abstract

Accurate pest classification plays a pivotal role in modern agriculture, enabling effective pest management and safeguarding crop health and productivity. While Convolutional Neural Networks (CNNs) have been widely used for classification, their limited ability to capture both local and global information hinders precise pest identification. Vision transformers, in contrast, have shown promise in capturing global dependencies and improving classification performance. However, the traditional attention mechanism employed in vision transformers, which derives the query (Q), key (K), and value (V) from the same input, overlooks spatial relationships between patches, limiting the model's capacity to capture fine-grained details and long-range dependencies in the image. To address these limitations, this study presents a novel approach, termed Hybrid Pooled Multihead Attention (HPMA), for pest classification that outperforms both CNN models and vision transformers. The HPMA model integrates hybrid pooling techniques into a modified attention mechanism to effectively capture local and global features within images. By emphasizing discriminative features and suppressing irrelevant information, the HPMA model achieves improved robustness and generalization. The model is trained and tested on a newly built dataset of 10 pest classes, achieving an accuracy of 98%. Furthermore, the proposed HPMA model is validated on two benchmark datasets, achieving accuracies of 98% and 95%, which demonstrates its effectiveness across diverse pest datasets. The results and ablation study confirm the model's strong performance in accurate pest classification, addressing agricultural pest challenges and enabling prompt pest control to reduce crop losses.
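The abstract does not specify the exact architecture, but the core idea it describes, pooling the attention inputs with a hybrid (average plus max) operator before computing multihead-style attention, can be sketched as follows. This is an illustrative NumPy reconstruction under assumed details (the pooling stride, the sum-combination of average and max pooling, and single-head attention are all assumptions, not the paper's confirmed design):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def hybrid_pool(x, stride=2):
    """Hybrid pooling over the token axis (assumption: avg + max, summed).

    x: (num_tokens, dim) -> (num_tokens // stride, dim)
    """
    n, d = x.shape
    n_trim = (n // stride) * stride          # drop any trailing remainder tokens
    blocks = x[:n_trim].reshape(-1, stride, d)
    return blocks.mean(axis=1) + blocks.max(axis=1)

def pooled_attention(q, k, v, stride=2):
    """Scaled dot-product attention with hybrid-pooled keys and values.

    Pooling K and V shortens the sequence the queries attend over,
    summarizing local neighborhoods (avg) while keeping salient
    activations (max) -- one plausible reading of "hybrid pooled" attention.
    """
    k_p = hybrid_pool(k, stride)
    v_p = hybrid_pool(v, stride)
    d = q.shape[-1]
    attn = softmax(q @ k_p.T / np.sqrt(d))   # (num_q, num_pooled_tokens)
    return attn @ v_p                        # (num_q, dim)

# Toy usage: 8 patch tokens of dimension 16.
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
k = rng.standard_normal((8, 16))
v = rng.standard_normal((8, 16))
out = pooled_attention(q, k, v, stride=2)    # shape (8, 16)
```

In this sketch, average pooling aggregates local context while max pooling preserves the strongest (most discriminative) responses, which is one way pooling could help the attention emphasize informative features and suppress irrelevant ones, as the abstract claims.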
