Abstract

In smart agriculture, analyzing animal vocalizations provides a non-invasive, continuous monitoring approach that correlates directly with specific animal conditions, enhancing welfare. Pig vocalizations, in particular, are critical for managing farm events and improving animal welfare. However, traditional methods, mainly based on Convolutional Neural Networks (CNNs), focus on local audio features and rely on complex feature combinations, and they struggle with varying audio lengths and high computational costs. Addressing these issues, this study introduces a novel approach based on an Audio Spectrogram Transformer (AST), designed to detect abnormal pig vocalizations. Our method involves a two-stage process: segmenting the audio to retain only the informative portions, and classifying it with an attention mechanism that analyzes both fine-grained and global audio features. This technique significantly improves the accuracy and efficiency of vocalization analysis. Tested on 7600 real-world audio samples, our method achieved an accuracy of 93% and a 19-fold improvement in inference speed over existing CNN-based techniques. Additionally, we conducted interpretability analysis and feature selection experiments to evaluate the efficacy of different feature combinations. These experiments verified that our attention-based approach not only simplifies the input features but also outperforms traditional models. The findings of this study underscore the potential of AST to transform livestock welfare monitoring by offering a more accurate, efficient, and scalable solution.
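To make the two-stage pipeline concrete, the sketch below pairs a simple energy-based segmentation step with a pretrained AST classifier from the HuggingFace transformers library. This is a minimal illustration under stated assumptions, not the paper's implementation: the RMS threshold, frame sizes, checkpoint name, and binary label set are all illustrative, and the replaced classification head would need fine-tuning on labeled pig vocalization data before it produces meaningful predictions.

```python
# Hypothetical two-stage pipeline:
#   Stage 1: energy-based segmentation keeps only vocal (non-silent) frames.
#   Stage 2: an Audio Spectrogram Transformer classifies the segment.
# The segmentation heuristic and labels are assumptions for illustration;
# the abstract does not specify the paper's exact segmentation method.
import numpy as np
import torch
from transformers import ASTFeatureExtractor, ASTForAudioClassification

SR = 16000  # AST checkpoints pretrained on AudioSet expect 16 kHz mono audio

def segment_audio(waveform, frame_len=1024, hop=512, rms_thresh=0.02):
    """Stage 1: drop low-energy frames, retaining only vocalizations."""
    frames = [waveform[i:i + frame_len]
              for i in range(0, len(waveform) - frame_len, hop)]
    voiced = [f for f in frames if np.sqrt(np.mean(f ** 2)) > rms_thresh]
    return np.concatenate(voiced) if voiced else waveform

# Stage 2: binary normal/abnormal classifier built on a pretrained AST.
CKPT = "MIT/ast-finetuned-audioset-10-10-0.4593"
extractor = ASTFeatureExtractor.from_pretrained(CKPT)
model = ASTForAudioClassification.from_pretrained(
    CKPT,
    num_labels=2,                  # illustrative labels: {normal, abnormal}
    ignore_mismatched_sizes=True,  # swap the AudioSet head for a 2-way head;
)                                  # this new head must be fine-tuned
model.eval()

def classify(waveform):
    """Return 'normal' or 'abnormal' for a float32 mono waveform at 16 kHz."""
    segment = segment_audio(waveform)
    inputs = extractor(segment, sampling_rate=SR, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return ["normal", "abnormal"][int(logits.argmax(-1))]
```

Because the AST's self-attention layers attend across the whole spectrogram, the same model handles the variable-length segments produced by stage 1 (the feature extractor pads or truncates them to a fixed number of frames) without the hand-crafted feature combinations that CNN pipelines typically require.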
