Pest-YOLO: A YOLOv5-Based Lightweight Crop Pest Detection Algorithm

Abstract

Traditional crop pest detection methods face the challenge of numerous parameters and computations, making them difficult to deploy on embedded devices with limited resources. Consequently, a lightweight network is an effective solution to this issue. Based on You Only Look Once (YOLO) v5, this paper aims to design and validate a lightweight and effective pest detector called pest-YOLO. First, a random background augmentation method is proposed to reduce the prediction error rate. Furthermore, a MobileNetV3-light backbone replaces the YOLOv5n backbone to reduce parameters and computations. Finally, the Convolutional Block Attention Module (CBAM) is integrated into the new network to compensate for the reduction in accuracy. Compared to the YOLOv5n model, the pest-YOLO model's parameters and Giga Floating-Point Operations (GFLOPs) decrease significantly, by about 33% and 52.5%, respectively, and the Frames per Second (FPS) increase by approximately 11.1%. In exchange, the Mean Average Precision (mAP50) declines slightly by 2.4 percentage points, from 92.7% to 90.3%.

Similar Papers
  • Research Article
  • Cited by 8
  • 10.3390/app14051869
YOLOv5-Sewer: Lightweight Sewer Defect Detection Model
  • Feb 24, 2024
  • Applied Sciences
  • Xingliang Zhao + 3 more

In the field of defect detection in sewers, much research focuses on high accuracy. However, it is challenging for portable on-site devices to provide high performance. This paper proposes a lightweight sewer defect detection model, You Only Look Once (YOLO) v5-Sewer. Firstly, the backbone network of YOLOv5s is replaced with a stacked MobileNetV3 block. Secondly, the C3 module of the neck of YOLOv5s is improved with a C3-Faster module. Thirdly, to compensate for the accuracy loss due to the lightweight network, channel attention (CA) and a convolutional block attention module (CBAM) are added to the proposed method. Finally, the Efficient Intersection over Union (EIOU) is adopted as the localization loss function. Experimental validation on the dataset shows that YOLOv5-Sewer achieves a 1.5% reduction in mean Average Precision (mAP) while reducing floating-point operations by 68%, the number of parameters by 55%, and the model size by 54%, compared to the YOLOv5s model. The detection speed reaches 112 frames per second (FPS) with the GPU (RTX 3070Ti). This model successfully implements a lightweight design while maintaining the detection accuracy, enhancing its functionality on low-performance devices.
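The EIoU localization loss adopted above augments plain IoU with separate penalties for center distance, width mismatch, and height mismatch, each normalized by the smallest enclosing box. A minimal sketch, assuming boxes given as (x1, y1, x2, y2) tuples (this is an illustration of the loss formula, not the paper's implementation):

```python
def eiou_loss(box_p, box_g):
    """Efficient IoU (EIoU) loss between a predicted and a ground-truth box.

    EIoU = 1 - IoU + center-distance term + width term + height term,
    normalized by the enclosing box's squared diagonal / width / height.
    """
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g

    # Intersection and union for plain IoU.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_p + area_g - inter)

    # Smallest enclosing box and its squared diagonal.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2

    # Squared distance between box centers.
    d2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
       + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared width and height differences.
    dw2 = ((px2 - px1) - (gx2 - gx1)) ** 2
    dh2 = ((py2 - py1) - (gy2 - gy1)) ** 2

    return 1.0 - iou + d2 / c2 + dw2 / cw ** 2 + dh2 / ch ** 2
```

For identical boxes every term vanishes and the loss is zero; any offset or shape mismatch adds a positive penalty, which is what gives EIoU faster convergence than IoU alone.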

  • Research Article
  • Cited by 35
  • 10.1049/cvi2.12072
TRC‐YOLO: A real‐time detection method for lightweight targets based on mobile devices
  • Oct 9, 2021
  • IET Computer Vision
  • Guanbo Wang + 5 more

Object detection is one of the main tasks of computer vision. Object detection algorithms usually rely on deep convolutional neural networks, which require the host device to have high computing capabilities, greatly limiting the application of object detection methods on mobile devices with limited computing capabilities, such as embedded devices. Among the current object detection algorithms, the You Only Look Once (YOLO) series takes both speed and accuracy into consideration and is one of the most commonly used methods for object detection. In this article, TRC-YOLO is proposed, which improves the mean average precision (mAP) and real-time detection speed of the model while reducing the size of the model. In TRC-YOLO, the convolution kernel of YOLO v4-tiny is pruned and an expansive convolution layer is introduced into the residual module of the network to produce an hourglass Cross Stage Partial ResNet (CSPResNet) structure. A receptive field block (RFB) that simulates human vision is also added, increasing the receptive field of the model and strengthening the feature extraction ability of the network. In addition, the convolutional block attention module is applied, which combines spatial attention and channel attention, to enhance the effective features of the model and reduce the negative impact of noise on the model. The size of the TRC-YOLO model is 17.8 MB, which is 5.9 MB smaller than YOLO v4-tiny, and its computational cost is 2.983 billion floating-point operations (BFLOPs), 3.834 BFLOPs less than YOLO v4-tiny. In addition, TRC-YOLO achieves a real-time performance of 36.9 frames per second on a Jetson Xavier NX, and its mAP on the PASCAL VOC dataset is 66.4 (3.83 higher than YOLO v4-tiny). The mAP of TRC-YOLO on the MS COCO dataset is 37.7, which is 1.9 higher than that of the baseline model.

  • Research Article
  • Cited by 21
  • 10.3390/agronomy13082139
Deep-Learning-Based Rice Disease and Insect Pest Detection on a Mobile Phone
  • Aug 15, 2023
  • Agronomy
  • Jizhong Deng + 8 more

The realization that mobile phones can detect rice diseases and insect pests not only solves the problems of low efficiency and poor accuracy from manual detection and reporting, but it also helps farmers detect and control them in the field in a timely fashion, thereby ensuring the quality of rice grains. This study examined two improved detection models for the detection of six high-frequency diseases and insect pests. These models were Improved You Only Look Once (YOLO)v5s and Improved YOLOv7-tiny, built on lightweight object detection networks. Improved YOLOv5s introduced the Ghost module to reduce computation and optimize the model structure, and Improved YOLOv7-tiny introduced the Convolutional Block Attention Module (CBAM) and SIoU to improve model learning ability and accuracy. First, we evaluated and analyzed the detection accuracy and operational efficiency of the models. Then we deployed the two proposed methods to a mobile phone. We also designed an application to further verify their practicality for detecting rice diseases and insect pests. The results showed that Improved YOLOv5s achieved the highest F1-Score of 0.931, 0.961 in mean average precision (mAP) (0.5), and 0.648 in mAP (0.5:0.9). It also reduced network parameters, model size, and floating-point operations (FLOPs) by 47.5%, 45.7%, and 48.7%, respectively. Furthermore, it increased the model inference speed by 38.6% compared with the original YOLOv5s model. Improved YOLOv7-tiny outperformed the original YOLOv7-tiny in detection accuracy, which was second only to Improved YOLOv5s. The probability heat maps of the detection results showed that Improved YOLOv5s performed better in detecting large target areas of rice diseases and insect pests, while Improved YOLOv7-tiny was more accurate in small target areas.
On the mobile phone platform, the precision and recall of Improved YOLOv5s under FP16 accuracy were 0.925 and 0.939, and the inference speed was 374 ms/frame, which was superior to Improved YOLOv7-tiny. Both of the proposed improved models realized accurate identification of rice diseases and insect pests. Moreover, the constructed mobile phone application based on the improved detection models provided a reference for realizing fast and efficient field diagnoses.
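The computation saving from the Ghost module used above can be estimated with simple multiply-accumulate counting: an ordinary convolution produces only a fraction 1/s of the output channels ("intrinsic" maps), and cheap d×d depthwise operations generate the rest. A rough back-of-envelope sketch, assuming the GhostNet defaults s=2 and d=3 (function names are illustrative, not the authors' code):

```python
def conv_flops(c_in, c_out, k, h, w):
    """Multiply-accumulate count of a standard k x k convolution."""
    return c_in * c_out * k * k * h * w

def ghost_flops(c_in, c_out, k, h, w, s=2, d=3):
    """Ghost module estimate: an ordinary conv produces c_out/s intrinsic
    maps, then cheap d x d depthwise ops generate the remaining maps."""
    intrinsic = c_out // s
    primary = conv_flops(c_in, intrinsic, k, h, w)   # ordinary conv part
    cheap = intrinsic * (s - 1) * d * d * h * w       # depthwise part
    return primary + cheap

std = conv_flops(64, 128, 3, 32, 32)    # standard 3x3 conv
gh = ghost_flops(64, 128, 3, 32, 32)    # ghost replacement
```

With these (arbitrary) layer sizes the ghost variant needs roughly half the operations, consistent with the ~48.7% FLOPs reduction reported for Improved YOLOv5s; the speedup approaches s as the input channel count grows.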

  • Research Article
  • Cited by 1
  • 10.1049/2024/9927636
An Improved Lightweight YOLO Algorithm for Recognition of GPS Interference Signals in Civil Aviation
  • Jan 1, 2024
  • IET Signal Processing
  • Mian Zhong + 7 more

Considering several sources that cause global position system (GPS) interference in civil aviation and the challenges faced by interference recognition algorithms in terms of efficiency and accuracy, we propose an improved You Only Look Once (YOLO)v7-CHS algorithm (YOLOv7-CHS) and investigate its effectiveness in identifying GPS signals and different types of interference signals. First, continuous wavelet transform (CWT) is introduced as a method for processing and analyzing signals in the time–frequency (TF) domain to effectively obtain their temporal and spectral characteristic information. Second, the ConvNeXt structure is integrated into the YOLOv7 backbone network to create a ConvNeXtBlock (CNeB) module to enhance the classification and recognition accuracy of interference signals. To effectively improve the capability of signal feature extraction and mitigate the impact of background noise on TF feature suppression, we have integrated the efficient channel attention (ECA) channel attention module with the convolutional block attention module (CBAM) spatial attention module, thereby proposing a hybrid CBAM and ECA (HCE) attention module. Finally, to address detection boxes being accidentally suppressed and multipath interference negatively affecting model recognition performance, we have employed the soft nonmaximum suppression (Soft-NMS) algorithm while selecting an optimal loss function through comparative analysis. The comparative evaluation experimental results under different circumstances show that YOLOv7-CHS achieves recognition accuracies of 98.0% and 99.6% for various types of signals, respectively. These values represent an increase of 1.7% and 1%, respectively, compared to YOLOv7.
Moreover, in terms of lightweight indicators, YOLOv7-CHS exhibits a significant improvement in performance: the frames per second (FPS) increase by 75.1, the number of parameters (Params) is reduced by 4.75 M, and giga floating-point operations (GFLOPs) are reduced by 65.9 G, while recognition capabilities are effectively enhanced. The proposed YOLOv7-CHS not only improves signal recognition accuracy but also reduces model Params and computational complexity, achieving a lightweight model with promising application prospects in the rapid detection and recognition of GPS interference sources in civil aviation.
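The Soft-NMS idea used above replaces hard suppression with score decay: boxes overlapping the current best detection are not deleted but have their confidence reduced by a Gaussian of the overlap. A minimal sketch of the Gaussian variant, assuming (x1, y1, x2, y2) boxes and illustrative parameter values (not the paper's implementation):

```python
import math

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: rather than discarding every box that overlaps
    the current best detection, decay its score by exp(-IoU^2 / sigma)."""
    def iou(a, b):
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    boxes, scores, keep = list(boxes), list(scores), []
    while boxes:
        # Pick the highest-scoring remaining box.
        i = max(range(len(scores)), key=scores.__getitem__)
        best = boxes.pop(i)
        keep.append((best, scores.pop(i)))
        # Decay overlapping boxes instead of hard suppression, then
        # drop any whose score fell below the threshold.
        scores = [s * math.exp(-iou(best, b) ** 2 / sigma)
                  for b, s in zip(boxes, scores)]
        kept = [(b, s) for b, s in zip(boxes, scores) if s >= score_thresh]
        boxes = [b for b, _ in kept]
        scores = [s for _, s in kept]
    return keep

detections = soft_nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)],
                      [0.9, 0.8, 0.7], sigma=0.5, score_thresh=0.4)
```

Here the second box overlaps the first heavily, so its score is decayed below the threshold and dropped, while the distant third box survives untouched; a borderline overlap would merely lose confidence rather than vanish, which is why Soft-NMS helps with accidentally suppressed detections.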

  • Research Article
  • 10.11591/ijeecs.v38.i3.pp1765-1781
CGDE-YOLOv5n: a real-time safety helmet-wearing detection algorithm
  • Jun 1, 2025
  • Indonesian Journal of Electrical Engineering and Computer Science
  • Wanbo Luo + 3 more

Due to numerous parameters and calculations, existing safety helmet-wearing detection models are challenging to deploy on embedded devices. Therefore, this paper proposed a You Only Look Once (YOLO) v5n-based lightweight detection algorithm called CGDE-YOLOv5n to address the shortcomings in the following areas: (i) the YOLOv5n algorithm was selected to minimize the model's parameters and calculations, reducing the hardware cost. (ii) The convolutional block attention module (CBAM) was integrated into the backbone to enhance the network's feature extraction capability. (iii) The neck was improved using the efficient re-parameterized generalized feature pyramid network (efficient RepGFPN) to enhance the multi-scale object detection capability. (iv) The C3 module was improved using the deformable ConvNets v2 (DCNv2) module to enhance the network's adaptability to geometric changes of objects. (v) The complete intersection over union (CIoU) loss was replaced with focal-efficient IoU (focal-EIoU) loss to reduce the missed detection rate. Experimental results demonstrated that CGDE-YOLOv5n achieved a mean average precision (mAP50) of 89.5% and a recall of 84%, which are 1% and 0.8% higher than those of YOLOv5n. In particular, the recall for workers not wearing safety helmets increased by 1.7%. Furthermore, the improved model achieved a detection speed of 68.5 frames per second (FPS), meeting the real-time requirements.

  • Research Article
  • 10.52783/jes.1087
Aircraft Object Detection Method of Airport Surface based on Improved YOLOv5
  • Apr 18, 2024
  • Journal of Electrical Systems
  • Rui Zhou

An aircraft object detection method based on improved YOLOv5 was proposed to address the issues of large model size, high parameter counts, and the inability of traditional object detection to meet real-time aircraft monitoring requirements. Firstly, the basic unit of the ShuffleNetv2 network was optimized by replacing 3x3 convolution with 5x5 convolution and removing the subsequent 1x1 convolution. Simultaneously, the original ReLU activation function was replaced with PReLU. Secondly, the CBAM (Convolutional Block Attention Module) attention mechanism was added to enhance the detection accuracy of the improved network. Finally, the improved ShuffleNetv2 network was applied as the backbone of YOLOv5. Experimental results revealed that the parameter count of the improved YOLOv5 method introduced in this paper was reduced by a factor of 18, with a model size of 1.03M. Moreover, a 20.8% increase was achieved in frames per second (FPS) in GPU environments and a 234.6% increase was observed in FPS in CPU environments, while a mean average precision (mAP@0.5) of 0.99 was maintained compared with the traditional YOLOv5 network. Owing to its fewer parameters, faster recognition speed, higher localization accuracy, and smaller memory requirement, the developed method was found to be suitable for real-time monitoring of aircraft on the airport surface.

  • Book Chapter
  • Cited by 1
  • 10.1007/978-981-99-0189-0_4
A Lightweight Network for Detecting Pedestrians in Hazy Weather
  • Jan 1, 2023
  • Balaram Murthy Chintakindi + 1 more

Most of the existing detection models fail to detect pedestrians in hazy weather conditions. Therefore, to improve driver safety in semi-autonomous or autonomous vehicles, a lightweight network has been proposed that can detect pedestrians effectively in hazy weather. The lightweight network combines You Only Look Once v2 (YOLOv2) + MobileNetv2 + the Convolutional Block Attention Module (CBAM). To build a more efficient and faster model, YOLOv2 is employed, and to reduce both the number of parameters and the computational complexity, MobileNetv2 is adopted as the backbone. In the proposed model, the bounding box loss error is optimized by applying normalization, and an attention module (CBAM) is introduced to improve detection accuracy. Prior to model training, the K-means clustering algorithm is applied to determine the optimal set of prior anchor boxes for the dataset. Experimental results show that the proposed network achieves 87.4% average precision on a hazy person dataset and runs at 173.6 frames per second (FPS).
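Anchor selection with K-means, as used above, typically clusters the (width, height) pairs of the training boxes with 1 − IoU as the distance, so that anchor quality is measured by overlap rather than Euclidean distance. A minimal sketch under those assumptions (median-based center update; this is a common YOLO recipe, not the authors' code):

```python
import random

def kmeans_anchors(wh, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchor shapes using 1 - IoU
    as the distance, as commonly done for YOLO anchor selection."""
    def iou(a, b):
        # Boxes are aligned at the origin, so only shapes matter.
        inter = min(a[0], b[0]) * min(a[1], b[1])
        return inter / (a[0] * a[1] + b[0] * b[1] - inter)

    rng = random.Random(seed)
    centers = rng.sample(wh, k)
    for _ in range(iters):
        # Assign each box to the center with the highest IoU.
        clusters = [[] for _ in range(k)]
        for box in wh:
            best = max(range(k), key=lambda i: iou(box, centers[i]))
            clusters[best].append(box)
        # Recompute each center as the median width/height of its cluster.
        new_centers = []
        for i, cl in enumerate(clusters):
            if not cl:
                new_centers.append(centers[i])
                continue
            ws = sorted(w for w, _ in cl)
            hs = sorted(h for _, h in cl)
            new_centers.append((ws[len(ws) // 2], hs[len(hs) // 2]))
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)

data = [(10, 10), (11, 9), (9, 11), (50, 60), (55, 58), (48, 62)]
anchors = kmeans_anchors(data, 2)
```

On this toy data the two anchors converge to one small and one large shape, matching the two size groups in the box list.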

  • Research Article
  • 10.32629/jai.v7i2.1255
Hardhat-wearing detection based on YOLOv5 in Internet-of-Things
  • Dec 22, 2023
  • Journal of Autonomous Intelligence
  • Wanbo Luo + 3 more

Worker safety is paramount in many industries. An essential component of industrial safety protocols involves the proper use of hardhats. However, due to lax safety awareness, many workers neglect to wear hardhats correctly, leading to frequent on-site accidents in China. Traditional detection methods, such as manual inspection and video surveillance, are inefficient and costly. Real-time monitoring of hardhat use is vital to boost compliance with hardhat usage and decrease accident rates. Recently, the advancement of the Internet of Things (IoT) and edge computing has provided an opportunity to improve these methods. In this study, two detection models based on You Only Look Once (YOLO) v5, hardhat-YOLOv5s and hardhat-YOLOv5n, were designed, validated, and implemented, tailored for hardhat detection. First, a public hardhat dataset was enriched to bolster the detection model's robustness. Then, hardhat detection models were trained using YOLOv5s and YOLOv5n, each catering to edge computing terminals with varying performance capacities. Finally, the models were validated using image and video data. The experimental results indicated that both models provided high detection precision and satisfied practical application needs. On the augmented public dataset, the hardhat-YOLOv5s and hardhat-YOLOv5n models have a Mean Average Precision (mAP) of 87.9% and 85.5%, respectively, for all six classes. Compared with the hardhat-YOLOv5s model, Parameters and Giga Floating-point Operations (GFLOPs) of the hardhat-YOLOv5n model decrease by 74.8% and 73.4%, respectively, and Frames per Second (FPS) increase by 30.5% on the validation dataset, which makes it more suitable for low-cost edge computing terminals with less computational power.

  • Research Article
  • Cited by 13
  • 10.3390/rs15153770
YOLOv7-MA: Improved YOLOv7-Based Wheat Head Detection and Counting
  • Jul 29, 2023
  • Remote Sensing
  • Xiaopeng Meng + 5 more

Detection and counting of wheat heads are crucial for wheat yield estimation. To address the issues of overlapping and small volumes of wheat heads on complex backgrounds, this paper proposes the YOLOv7-MA model. By introducing micro-scale detection layers and the convolutional block attention module, the model enhances the target information of wheat heads and weakens the background information, thereby strengthening its ability to detect small wheat heads and improving the detection performance. Experimental results indicate that after being trained and tested on the Global Wheat Head Dataset 2021, the YOLOv7-MA model achieves a mean average precision (mAP) of 93.86% with a detection speed of 35.93 frames per second (FPS), outperforming the Faster-RCNN, YOLOv5, YOLOX, and YOLOv7 models. Meanwhile, when tested under the three conditions of low illumination, blur, and occlusion, the coefficient of determination (R2) of YOLOv7-MA is 0.9895, 0.9872, and 0.9882, respectively, and the correlation between the predicted wheat head number and the manual counting result is stronger than that of the other models. In addition, when the YOLOv7-MA model is transferred to field-collected wheat head datasets, it maintains high performance, with mAP in the maturity and filling stages of 93.33% and 93.03%, respectively, and R2 values of 0.9632 and 0.9155, respectively, demonstrating better performance in the maturity stage. Overall, YOLOv7-MA has achieved accurate identification and counting of wheat heads in complex field backgrounds. In the future, its application with unmanned aerial vehicles (UAVs) can provide technical support for large-scale wheat yield estimation in the field.

  • Research Article
  • Cited by 46
  • 10.3389/fpls.2022.927424
MGA-YOLO: A lightweight one-stage network for apple leaf disease detection.
  • Aug 22, 2022
  • Frontiers in Plant Science
  • Yiwen Wang + 2 more

Apple leaf diseases seriously damage the yield and quality of apples. Current apple leaf disease diagnosis methods primarily rely on human visual inspection, which often results in low efficiency and insufficient accuracy. Many computer vision algorithms have been proposed to diagnose apple leaf diseases, but most of them are designed to run on high-performance GPUs. This potentially limits their application in the field, in which mobile devices are expected to be used to perform computer vision-based disease diagnosis on the spot. In this paper, we propose a lightweight one-stage network, called the Mobile Ghost Attention YOLO network (MGA-YOLO), which enables real-time diagnosis of apple leaf diseases on mobile devices. We also built a dataset, called the Apple Leaf Disease Object Detection dataset (ALDOD), that contains 8,838 images of healthy and infected apple leaves with complex backgrounds, collected from existing public datasets. In our proposed model, we replaced the ordinary convolution with the Ghost module to significantly reduce the number of parameters and floating point operations (FLOPs) due to cheap operations of the Ghost module. We then constructed the Mobile Inverted Residual Bottleneck Convolution and integrated the Convolutional Block Attention Module (CBAM) into the YOLO network to improve its performance on feature extraction. Finally, an extra prediction head was added to detect extra large objects. We tested our method on the ALDOD testing set. Results showed that our method outperformed other state-of-the-art methods with the highest mAP of 89.3%, the smallest model size of only 10.34 MB and the highest frames per second (FPS) of 84.1 on the GPU server. The proposed model was also tested on a mobile phone, which achieved 12.5 FPS. In addition, by applying image augmentation techniques on the dataset, mAP of our method was further improved to 94.0%. 
These results suggest that our model can accurately and efficiently detect apple leaf diseases and can be used for real-time detection of apple leaf diseases on mobile devices.

  • Research Article
  • 10.1016/j.fuel.2024.132899
Recognition of dispersed organic matter macerals using YOLOv5m model with convolutional block attention module
  • Aug 25, 2024
  • Fuel
  • Yuanzhe Wu + 6 more


  • Research Article
  • Cited by 95
  • 10.3390/agriculture12060856
A Real-Time Apple Targets Detection Method for Picking Robot Based on ShufflenetV2-YOLOX
  • Jun 13, 2022
  • Agriculture
  • Wei Ji + 3 more

In order to enable the picking robot to detect and locate apples quickly and accurately in the orchard natural environment, we propose an apple object detection method based on Shufflenetv2-YOLOX. This method takes YOLOX-Tiny as the baseline and uses the lightweight network Shufflenetv2 added with the convolutional block attention module (CBAM) as the backbone. An adaptive spatial feature fusion (ASFF) module is added to the PANet network to improve the detection accuracy, and only two extraction layers are used to simplify the network structure. The average precision (AP), precision, recall, and F1 of the trained network under the verification set are 96.76%, 95.62%, 93.75%, and 0.95, respectively, and the detection speed reaches 65 frames per second (FPS). The test results show that the AP value of Shufflenetv2-YOLOX is increased by 6.24% compared with YOLOX-Tiny, and the detection speed is increased by 18%. At the same time, it has a better detection effect and speed than the advanced lightweight networks YOLOv5-s, Efficientdet-d0, YOLOv4-Tiny, and Mobilenet-YOLOv4-Lite. Meanwhile, the half-precision floating-point (FP16) accuracy model on the embedded device Jetson Nano with TensorRT acceleration can reach 26.3 FPS. This method can provide an effective solution for the vision system of the apple picking robot.
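As a quick consistency check, the F1 of 0.95 reported above follows directly from the reported precision (95.62%) and recall (93.75%), since F1 is their harmonic mean:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall reported for Shufflenetv2-YOLOX above.
f1 = f1_score(0.9562, 0.9375)  # ~0.947, matching the reported F1 of 0.95
```

The harmonic mean punishes imbalance: a model with precision 0.99 but recall 0.50 would score only about 0.66, which is why F1 is preferred over the arithmetic mean for detector comparisons.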

  • Research Article
  • Cited by 13
  • 10.1016/j.compag.2024.109388
A detection method for dead caged hens based on improved YOLOv7
  • Aug 31, 2024
  • Computers and Electronics in Agriculture
  • Jikang Yang + 5 more


  • Research Article
  • Cited by 37
  • 10.1109/tim.2022.3219468
YOLO-Former: Marrying YOLO and Transformer for Foreign Object Detection
  • Jan 1, 2022
  • IEEE Transactions on Instrumentation and Measurement
  • Yuan Dai + 4 more

The automatic detection of foreign objects between platform screen doors (PSDs) and metro train doors is vital to personnel and property safety and to maintaining the train's normal operation. However, some existing works only determine the presence of foreign objects but cannot indicate their categories. Besides, although deep-learning-based object detection algorithms can indicate the presence and categories of foreign objects, most of them only harness the information in region proposals, ignoring global contextual information. Furthermore, their performance comes at a considerable cost in computational complexity, so they cannot be well deployed in the metro environment. To address these issues and better implement foreign object detection (FOD), we present You Only Look Once-Transformer (YOLO-Former), a simple but efficient model. YOLO-Former is built on YOLOv5 through the following procedure. First, the vision transformer (ViT) is introduced for dynamic attention and global modeling, thereby solving the problem that the original YOLOv5 only utilizes information in region proposals and has insufficient ability to capture global information. Second, the convolutional block attention module (CBAM) and Stem module are used to further improve feature expression ability and reduce floating-point operations (FLOPs). Finally, we design various variants with different widths and depths to meet every need. Experiments on the foreign object detection dataset (FODD) and PASCAL VOC dataset demonstrate that YOLO-Former-x consistently outperforms other state-of-the-art methods by significant margins (0.5 to 11.3 mean average precision, mAP, on FODD and 0.6 to 13.6 on the PASCAL VOC dataset). Last but not least, YOLO-Former-x maintains real-time processing speed (27.32 and 28.17 frames per second, FPS, on a TITAN Xp).

  • Research Article
  • Cited by 11
  • 10.3389/fpls.2023.1265025
A tree species classification model based on improved YOLOv7 for shelterbelts.
  • Jan 18, 2024
  • Frontiers in Plant Science
  • Yihao Liu + 5 more

Tree species classification within shelterbelts is crucial for shelterbelt management. Large-scale satellite-based and low-altitude drone-based approaches serve as powerful tools for forest monitoring, especially in tree species classification. However, these methods face challenges in distinguishing individual tree species within complex backgrounds. Additionally, trees growing mixed within protective forests have similar crown sizes across species, and the complex background of the shelterbelts negatively impacts the accuracy of tree species classification. The You Only Look Once (YOLO) algorithm is widely used in the fields of agriculture and forestry, i.e., plant and fruit identification, pest and disease detection, and tree species classification in forestry. We proposed a YOLOv7-Kmeans++_CoordConv_CBAM (YOLOv7-KCC) model for tree species classification based on drone RGB remote sensing images. Firstly, we constructed a dataset for tree species in shelterbelts and adopted data augmentation methods to mitigate overfitting due to limited training data. Secondly, the K-means++ algorithm was employed to cluster anchor boxes in the dataset. Furthermore, to enhance the YOLOv7 backbone network's Efficient Layer Aggregation Network (ELAN) module, we used Coordinate Convolution (CoordConv) to replace the ordinary 1×1 convolution. The Convolutional Block Attention Module (CBAM) was integrated into the Path Aggregation Network (PANet) structure to facilitate multiscale feature extraction and fusion, allowing the network to better capture and utilize crucial feature information. Experimental results showed that the YOLOv7-KCC model achieves a mean average precision@0.5 of 98.91%, outperforming the Faster RCNN-VGG16, Faster RCNN-Resnet50, SSD, YOLOv4, and YOLOv7 models by 5.71%, 11.75%, 5.97%, 7.86%, and 3.69%, respectively.
The GFLOPs and parameter values of the YOLOv7-KCC model stand at 105.07 G and 143.7 MB, representing an almost 5.6% increase in the F1 metric compared to YOLOv7. Therefore, the proposed YOLOv7-KCC model can effectively classify shelterbelt tree species, providing a scientific theoretical basis for shelterbelt management in Northwest China, particularly Xinjiang.
