Abstract

Recent work [1,2] has shown that Convolutional Neural Networks (CNNs) trained on spectrograms of acoustic signals can learn high-level latent representations for detecting and classifying the vocalizations of endangered baleen whales. These latent representations were used to develop an automated system that detects the vocalizations of blue, fin, and sei whales against non-biological and ambient noise sources with high accuracy (accuracy = 0.961, F1 score = 0.899). In this work, we conduct an exploratory analysis of these latent representations using statistical machine learning approaches and by visualizing the convolutional feature maps learned by the CNN. Through this analysis, we attempt to interpret which properties of a spectrogram are most readily exploited by the CNN during training, with the aim of improving on the state of the art and developing more robust detection systems.

[1] M. Thomas, B. Martin, K. Kowarski, B. Gaudet, and S. Matwin, "Marine Mammal Species Classification Using Convolutional Neural Networks and a Novel Acoustic Representation," ECML PKDD 2019, Springer, Cham, 2019.
[2] M. Thomas, "Towards a Novel Data Representation for Classifying Acoustic Signals," Canadian Conference on Artificial Intelligence, Springer, Cham, 2019.
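
As a rough illustration of the feature-map visualization described above, the following is a minimal PyTorch sketch. The two-layer network, input dimensions, and layer names are illustrative assumptions, not the architecture from [1,2]; the forward-hook mechanism shown here is a generic technique for capturing intermediate activations, and would apply to a trained detector in the same way.

    # Minimal sketch: visualizing convolutional feature maps for a
    # spectrogram input. The toy model and input shape below are
    # illustrative assumptions, not the network from [1,2].
    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    # Hypothetical stand-in for a trained spectrogram CNN.
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
    )
    model.eval()

    # Capture the activations of the first conv layer with a forward hook.
    feature_maps = {}
    def hook(module, inputs, output):
        feature_maps["conv1"] = output.detach()

    model[0].register_forward_hook(hook)

    # A dummy spectrogram: 1 channel, 128 frequency bins x 256 time frames.
    spectrogram = torch.randn(1, 1, 128, 256)
    with torch.no_grad():
        model(spectrogram)

    # Plot the first 8 learned feature maps.
    maps = feature_maps["conv1"][0]
    fig, axes = plt.subplots(2, 4, figsize=(12, 6))
    for i, ax in enumerate(axes.flat):
        ax.imshow(maps[i].numpy(), aspect="auto", origin="lower")
        ax.set_title(f"filter {i}")
        ax.axis("off")
    plt.tight_layout()
    plt.show()

Forward hooks leave the trained model untouched, so the same approach could be applied to a loaded detection model without modifying its layers.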
