Wood-boring pests (WBPs) are among the pests most destructive to trees, yet their concealed lifestyles make them difficult to detect. Audio signals generated by WBP activity can serve as evidence of their presence. This paper presents a novel lightweight convolutional neural network (CNN), named Mobile-Acoustic Net (MAnet), which features efficient structures and uses audio signals to detect the presence of WBPs at an early stage. A public benchmark dataset was expanded by combining pure WBP sounds with pure environmental sounds. Two audio features, Mel Frequency Cepstral Coefficients (MFCC) and Linear Frequency Cepstral Coefficients (LFCC), were extracted from the original audio recordings and served as inputs to MAnet. To make better use of these input features, a novel deep fusion module was designed to generate fused features. To reduce the model's computational cost, Blueprint Separable Convolutions replaced the classical convolution operations. Efficient Channel Attention was added at minimal cost to enhance model performance. Decoupled knowledge distillation was employed to train the trimmed model, resulting in a smaller version of MAnet that maintains similar recognition performance. Experiments showed that MAnet achieved better performance with a smaller size than other models, with an average accuracy of 96.136% and a minimum computation of 8.811 M, demonstrating its effectiveness. MAnet offers a new way to detect WBPs, contributing to forest protection and pest management.
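To illustrate why replacing classical convolutions with Blueprint Separable Convolutions reduces computation, the following minimal sketch compares the weight count and multiply-accumulate operations (MACs) of a dense convolution with those of a 1x1 pointwise convolution followed by a depthwise convolution. It assumes the unconstrained variant of the blueprint separable layer, stride 1, "same" padding, and no bias terms; the abstract does not state which variant MAnet uses, and the layer sizes and function names below are hypothetical, not taken from the paper.

```python
def standard_conv_cost(c_in, c_out, k, h, w):
    """Weights and MACs of a dense k x k convolution (no bias, stride 1)."""
    params = c_in * c_out * k * k
    macs = params * h * w  # every weight is applied once per output position
    return params, macs


def bsconv_u_cost(c_in, c_out, k, h, w):
    """Weights and MACs of a 1x1 pointwise conv followed by a k x k depthwise conv.

    Assumption: unconstrained blueprint separable layout, no bias, stride 1.
    """
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    depthwise = c_out * k * k  # one k x k filter per output channel
    params = pointwise + depthwise
    macs = params * h * w
    return params, macs


# Hypothetical layer: 64 -> 128 channels, 3x3 kernel, 32x32 feature map.
std_params, std_macs = standard_conv_cost(64, 128, 3, 32, 32)
bs_params, bs_macs = bsconv_u_cost(64, 128, 3, 32, 32)

print(std_params, bs_params)             # 73728 9344
print(round(std_params / bs_params, 2))  # 7.89
```

Under these assumptions the separable layer needs roughly one eighth of the weights and MACs of the dense layer for a 3x3 kernel, and the saving grows with kernel size and channel count.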