Abstract
This paper presents a project on automatic bird species recognition based on bird vocalization. Eighteen bird species from 6 different families were analyzed. First, human factor cepstral coefficients representing the signal were calculated from the individual recordings. Next, a voice activity detection system was used to detect segments of bird vocalization. For each segment, the likelihood with which the given code value corresponds to a given model was calculated using the individual hidden Markov models. For each bird species, exactly one hidden Markov model was trained. An interspecific success rate of 81.2% was reached. For classification into families, the success rate reached 90.45%.
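The scoring step described above (one hidden Markov model per species, with each detected segment assigned to the model that gives it the highest likelihood) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each segment has already been reduced to a non-empty sequence of discrete codebook indices, and that each model is a small discrete HMM scored with the forward algorithm. The species names, the two-state topology, and all probabilities are hypothetical toy values chosen only for illustration.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(codes, model):
    """Forward algorithm in log space: log P(codes | model)."""
    if not codes:
        raise ValueError("segment must contain at least one code value")
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    # Initialisation: start probability times emission of the first code.
    alpha = [safe_log(start[s]) + safe_log(emit[s][codes[0]]) for s in range(n)]
    # Recursion: sum over predecessor states, then emit the next code.
    for code in codes[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][code])
            for s in range(n)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)


def classify(codes, models):
    """Return the best-scoring model name and all per-model log-likelihoods."""
    scores = {name: log_likelihood(codes, m) for name, m in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy models: 2 left-to-right states, 3 codebook symbols.
MODELS = {
    "species_a": {
        "start": [1.0, 0.0],
        "trans": [[0.7, 0.3], [0.0, 1.0]],
        "emit": [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]],
    },
    "species_b": {
        "start": [1.0, 0.0],
        "trans": [[0.7, 0.3], [0.0, 1.0]],
        "emit": [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]],
    },
}

if __name__ == "__main__":
    segment = [0, 0, 2, 2]  # hypothetical code sequence for one detected segment
    best, scores = classify(segment, MODELS)
    print(best, scores)
```

Working in log space avoids numerical underflow on long segments, which is why the sketch sums probabilities through `log_sum_exp` rather than multiplying them directly.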
Highlights
1 Introduction: Automatic bird vocalization recognition builds on knowledge obtained during speech recognition research.
As noted in the work [1] on human speech recognition, this is an interdisciplinary field that combines findings from several scientific disciplines, such as physiology, acoustics, and signal processing.
4 Conclusions: In the experiment described above, we demonstrate automatic classification of the vocalizations of 18 bird species, using a voice activity detection (VAD) module to detect vocalization segments in the recordings.
Summary
Automatic bird vocalization recognition builds on knowledge obtained during speech recognition research. Bird vocalization recognition and speech recognition are, to a large extent, similar tasks, and both require solving several basic problems. As noted in the work [1] on human speech recognition, this is an interdisciplinary field that combines findings from several scientific disciplines, such as physiology, acoustics, and signal processing. For bird vocalization recognition, we use knowledge of the vocalization production process based on the physiology of the vocal organ. Besides choosing a proper parameterization method, in both cases we have to cope with noise in the recordings and with the variability of human speech or bird vocalization.
More From: EURASIP Journal on Audio, Speech, and Music Processing