Abstract

Rapid declines in frog populations have been observed worldwide and are regarded as one of the most critical threats to global biodiversity. Recent advances in acoustic sensors provide a novel way to record frog vocalizations and thereby inform conservation policy. Specifically, frog populations can be monitored by detecting frog species in the collected recordings. Previous studies have explored various acoustic features for classifying frog calls. However, few studies have investigated visual features for frog call classification, even though such features have been used successfully in acoustic event detection and in speech and speaker recognition. In this study, various acoustic and visual features are proposed for frog call classification: MPEG-7 audio descriptors, syllable duration, oscillation rate, entropy-related features, linear prediction coefficients, Mel-frequency cepstral coefficients, local binary patterns, and histograms of oriented gradients. After segmenting continuous frog calls into individual syllables, the constructed feature sets are evaluated with a k-nearest neighbour classifier and support vector machines. Comprehensive results on 16 frog species demonstrate the effectiveness of both acoustic and visual features for classifying frog calls.
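The pipeline described above (segment syllables, extract per-syllable features, classify with k-nearest neighbour) can be illustrated with a minimal sketch. This is not the paper's implementation: the two "species", their call frequencies, the synthetic sine-plus-noise syllables, and the choice of just two simple features (dominant frequency via a naive DFT and zero-crossing rate) are all illustrative assumptions, standing in for the richer feature sets (MFCCs, LBP, HOG, etc.) used in the study.

```python
import math, random

def zero_crossing_rate(x):
    # Fraction of adjacent sample pairs whose signs differ.
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0) / (len(x) - 1)

def dominant_freq(x, sr):
    # Naive O(n^2) DFT peak search (stdlib only; use an FFT library in practice).
    n = len(x)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mag = re * re + im * im
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * sr / n  # convert bin index to Hz

def make_syllable(freq, sr=8000, dur=0.032, noise=0.05):
    # Synthetic stand-in for one segmented syllable: a noisy sinusoid.
    n = int(sr * dur)
    return [math.sin(2 * math.pi * freq * t / sr) + random.gauss(0, noise)
            for t in range(n)]

def features(x, sr=8000):
    # Tiny feature vector; the paper uses far richer acoustic/visual features.
    return (dominant_freq(x, sr), zero_crossing_rate(x))

def knn_predict(train, query, k=1):
    # train: list of (feature_vector, label); majority vote over k nearest.
    ranked = sorted(train,
                    key=lambda fv: sum((a - b) ** 2 for a, b in zip(fv[0], query)))
    labels = [lab for _, lab in ranked[:k]]
    return max(set(labels), key=labels.count)

random.seed(0)
# Two hypothetical species with different dominant call frequencies (assumed values).
train = [(features(make_syllable(f)), s)
         for s, f in [("species_A", 900), ("species_B", 2200)]
         for _ in range(5)]

query = features(make_syllable(950))  # an unseen syllable near species_A's band
print(knn_predict(train, query, k=3))
```

In a real system the kNN step would typically be replaced or complemented by an SVM, as in the study, and the feature extraction would operate on spectrogram representations rather than raw samples.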
