Abstract

Frogs are often considered as excellent indicators of the overall state of the natural environment, but a steady decrease in the frog population has been noticed worldwide. To monitor this change of frog population and optimise the protection policy, frog call classification has become an important bioacoustic research topic. However, automatic acoustic classification of frog calls has not been adequately addressed in the literature. In this paper, an enhanced feature representation for frog call classification using the temporal, perceptual and cepstral features is presented. With the enhanced feature representation, the time-frequency information of frog calls can be effectively represented, which gives a good classification performance. To be specific, each continuous frog recording is first segmented into individual syllables using the Ha¨rma¨’s method. Then, temporal, perceptual, and cepstral features are calculated from each syllable: syllable duration, Shannon entropy, Rényi entropy, zero-crossing rate, averaged energy, oscillation rate, spectral centroid, spectral flatness, spectral roll-off, signal bandwidth, spectral flux, fundamental frequency, linear predictive coding, and Mel-frequency cepstral coefficients. Next, different feature vectors are fused to obtain different enhanced feature representations. Finally, different enhanced feature representations are compared using five machine learning algorithms: linear discriminant analysis, K-nearest neighbour, support vector machines, random forest, and artificial neural network. Experiment results show that our proposed feature representation could achieve better classification performance comparing to other methods with twenty-four frog species, which are geographically well distributed throughout Queensland, Australia.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call