Abstract
Voice activity detection (VAD) is the task of identifying vocal segments in an audio clip. By discarding the non-vocal portions of an input signal, it reduces computational overhead and improves the recognition performance of speech-based systems. In this paper, we present a VAD technique that uses line spectral frequency-based statistical features, termed LSF-S, coupled with extreme learning-based classification. The experiments were performed on a database of more than 350 h of data drawn from diverse sources. We obtained an encouraging overall accuracy of 99.43%.