Abstract
Passive acoustic monitoring of marine mammals is an essential tool for researchers tracking the populations of individual species in threatened environments. Given the large quantity of audio data generated by passive acoustic arrays, it is desirable to automate the identification of the marine mammals present in the recordings. Using acoustic data from the William A. Watkins Marine Mammal Sounds Database, we present an approach based on residual learning networks (ResNets) for classifying the vocalizations of up to 32 marine mammal species. We first determine the optimal methods for converting acoustic recordings into discrete spectrograms suitable for input into neural networks. We examine a series of configurations for spectrographic window functions, preprocessing augmentations, and multi-channel spectrogram generation. Each configuration’s spectrographic output is used to train a residual learning network, and the configurations are ranked by multi-class classification performance using a weighted F1-score, where F1 is the harmonic mean of precision and recall. Configurations specifying 512×256 spectrograms created with a Hann window of 1024 and utilizing horizontal roll demonstrate superior performance. We use the top-performing configurations to generate training data for a series of single-channel and multi-channel residual neural networks. These networks are trained to high precision before their multi-class classification performance is evaluated. A single-channel network performed best, obtaining an F1-score of 0.867 with an AUC of 0.9281 on a 32-class classification task. Our multi-channel configuration obtained an F1-score of 0.846 with an AUC of 0.9169. While we demonstrate that networks may learn more information from multi-channel spectrographic inputs, we find that single-channel spectrograms offer superior classification performance overall.
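The pipeline stages named in the abstract (Hann-windowed spectrogram, horizontal roll augmentation, weighted F1-score) can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes that "a Hann window of 1024" means 1024 samples, it picks a hop of 256 samples arbitrarily, and it does not resize the output to 512×256. The function names `hann_spectrogram`, `horizontal_roll`, and `weighted_f1` are hypothetical.

```python
import numpy as np


def hann_spectrogram(signal, n_fft=1024, hop=256):
    """Log-magnitude spectrogram with a Hann window.

    n_fft=1024 assumes the abstract's window size is in samples;
    hop=256 is an illustrative choice, not taken from the paper.
    Returns an array of shape (n_fft // 2 + 1, n_frames).
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=0))
    return np.log1p(magnitude)


def horizontal_roll(spec, shift):
    """Circularly shift a spectrogram along its time (column) axis."""
    return np.roll(spec, shift, axis=1)


def weighted_f1(y_true, y_pred, n_classes):
    """Per-class F1 averaged with weights equal to each class's support."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    total = 0.0
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += (tp + fn) * f1  # weight by the number of true samples
    return total / len(y_true)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(3328)  # synthetic stand-in for a recording
    spec = hann_spectrogram(audio)
    augmented = horizontal_roll(spec, shift=3)
    print(spec.shape, augmented.shape)
    print(weighted_f1([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2))
```

The roll is circular, so frames pushed past the right edge reappear on the left. This keeps the spectrogram dimensions fixed, which is convenient for a network that expects a fixed input size.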