Abstract
As part of a long-term bioacoustics monitoring project, audio containing both anthrophony and biophony was recorded around the clock in a residential area of upstate New York for ten months during 2019–2020. To analyze the ecological content of the data with as little manual intervention as possible, the recordings are classified automatically using deep learning techniques. First, the data is segmented and passed through a binary CNN-LSTM (a convolutional neural network combined with a long short-term memory network) to separate “signal” from “silence.” Next, a small subset of the dataset is manually annotated via visual inspection of log-mel spectrograms and used to train a multiclass CNN-LSTM, a method which reaches testing accuracies of over 90%. Algorithm performance on this manually annotated dataset is compared to performance on unabridged, “real world” audio data, and strategies for handling issues such as the lack of training data, multi-label classification, and the “none of the above” class are explored. The classification results are ultimately used to generate long-term seasonal sound maps, which are cross-referenced with local weather data.
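The two-stage flow described above (segment, gate on signal versus silence, then classify only the signal segments) can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the authors' implementation: the segment length, the function names, and the two toy stand-in functions are all assumptions introduced here, with a simple energy threshold and a peak-amplitude rule standing in for the binary and multiclass CNN-LSTM models.

```python
from typing import Callable, List, Optional, Sequence


def segment(samples: Sequence[float], segment_len: int) -> List[Sequence[float]]:
    """Split a recording into consecutive, non-overlapping segments.

    A trailing partial segment is dropped (an assumption made for simplicity).
    """
    n_full = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len] for i in range(n_full)]


def classify_recording(
    samples: Sequence[float],
    segment_len: int,
    is_signal: Callable[[Sequence[float]], bool],
    classify: Callable[[Sequence[float]], str],
) -> List[Optional[str]]:
    """Run the two-stage pipeline: binary gate first, multiclass model second.

    Segments rejected by the gate are labeled None ("silence") and never
    reach the multiclass stage.
    """
    labels: List[Optional[str]] = []
    for seg in segment(samples, segment_len):
        labels.append(classify(seg) if is_signal(seg) else None)
    return labels


# Hypothetical stand-ins for the trained models (NOT CNN-LSTMs):
def toy_is_signal(seg: Sequence[float]) -> bool:
    """Stand-in for the binary model: mean energy above a fixed threshold."""
    return sum(x * x for x in seg) / len(seg) > 0.01


def toy_classify(seg: Sequence[float]) -> str:
    """Stand-in for the multiclass model: label by peak amplitude."""
    return "loud" if max(abs(x) for x in seg) > 0.5 else "quiet"


# Toy recording: one silent segment, one quiet segment, one loud segment,
# plus two leftover samples that do not fill a segment.
recording = [0.0] * 4 + [0.2, -0.2, 0.2, -0.2] + [0.9, -0.9, 0.9, -0.9] + [0.0, 0.0]
labels = classify_recording(recording, 4, toy_is_signal, toy_classify)
print(labels)
```

In a real system, the two callables would wrap trained networks operating on log-mel spectrograms of each segment; passing them in as parameters keeps the gating logic independent of any particular model.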