Deep whistle contour: Recall-guided learning from synthesis

Pu Li, Eva-Marie Nosal, Kaitlin Palmer, Douglas M Gillespie, Tyler A Helble, Holger Klinck, Danielle Cholewiak, Yu Shiu, Xiaobai Liu, Marie Roch, Erica Fleishman

doi:10.1121/1.5137332

Abstract

This paper presents a learning-based method for detecting whistles of toothed whales in underwater hydrophone recordings. Our method represents audio signals as time-frequency spectrograms and employs a Fully Convolutional Network (FCN) to estimate, for each spectrogram, a map of contour confidences from which discrete whistle contours are extracted. To avoid the expense of annotating whistle contours, we develop a data synthesis approach that generates spectrogram-contour pairs from spectrograms of the background environment and a small set of whistle contours. Our study suggests that the deep contour model can be learned effectively from these synthesized samples. However, it is costly and unnecessary to synthesize an equal number of samples for each spectrogram or contour. Instead, we present an alternative learning algorithm that synthesizes samples only for those spectrograms or contours that are not well modeled by the current network, as measured by the recall rate of contour points for each spectrogram-contour sample. This recall-guided learning algorithm adaptively synthesizes difficult samples to boost learning effectiveness. We applied the proposed method to the public DCLDE2011 dataset to extract whistle contours. Results show that our method improves on the state-of-the-art method by up to 21.9% in F-score for multiple odontocete species.
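The recall-guided idea can be illustrated with a minimal sketch. The abstract does not give the exact selection rule, so everything below is an assumption made for illustration: the function names `point_recall` and `synthesis_budget`, the confidence threshold of 0.5, the `floor` parameter, and the choice to allocate new samples in proportion to (1 - recall). It shows only the general pattern of measuring per-sample recall of contour points and then directing more synthesis toward poorly modeled samples.

```python
def point_recall(true_points, confidence, threshold=0.5):
    """Fraction of annotated contour points that the network recovers.

    true_points: iterable of (time_bin, freq_bin) tuples on the contour.
    confidence:  dict mapping (time_bin, freq_bin) to a score in [0, 1];
                 bins missing from the dict count as score 0.
    threshold:   hypothetical cut-off for calling a point "detected".
    """
    true_points = list(true_points)
    if not true_points:
        return 1.0  # nothing to recall, so treat the sample as well modeled
    hits = sum(1 for p in true_points if confidence.get(p, 0.0) >= threshold)
    return hits / len(true_points)


def synthesis_budget(recalls, total, floor=0.05):
    """Split `total` new synthetic samples across items by (1 - recall).

    Items with low recall receive more samples. `floor` (an assumption)
    keeps a small share for items that are already well modeled.
    Counts are rounded, so they may not sum exactly to `total`.
    """
    weights = [max(1.0 - r, floor) for r in recalls]
    scale = total / sum(weights)
    return [round(w * scale) for w in weights]


# Toy example: one contour with four points, two of which are detected.
contour = [(0, 1), (1, 2), (2, 3), (3, 4)]
scores = {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.7}
recall = point_recall(contour, scores)

# Three background/contour items with different recall rates.
budget = synthesis_budget([0.9, 0.5, 0.1], total=30)
```

In a training loop, these two steps would alternate with network updates: evaluate recall with the current network, synthesize according to the budget, train, and repeat. Deterministic proportional allocation is only one option; sampling items with probability proportional to (1 - recall) would follow the same principle.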

Full Text