Abstract

In recent years, speech signal processing has become one of the most challenging fields. Nowadays, several voice-based biometric authentication systems are available to meet everyday needs. The overall performance of these systems depends on the complexity of each part of the algorithm. The Voice Activity Detector (VAD) is one such part. It can be designed in the time domain as well as in the frequency domain. Time-domain VADs have lower complexity than VADs based on the frequency domain or on other domains, such as the wavelet transform domain. Normally, a VAD is used to remove the silent portions of a speech signal, and a suitable VAD must have a high speech activity detection rate (HR1) and a high non-speech detection rate (HR0). HR1 and HR0 are quantitative measures of VAD accuracy: they indicate how correctly the speech and silence parts of a speech signal are identified. In this paper, we consider only time-domain VADs for clean speech; however, existing time-domain VAD algorithms do not achieve high HR1 and high HR0 simultaneously. Therefore, we design an unsupervised VAD algorithm that has both high HR1 and high HR0. The main advantage of unsupervised data segmentation is that no prior threshold information is required. Here, we use K-means as the unsupervised clustering technique. We compare different time-domain voice activity detectors with our proposed VAD, which we call KVD (K-mean Voice-activity Detector). The results of this comparison show that the proposed VAD achieves the highest performance.
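
To make the idea of threshold-free segmentation concrete, the following is a minimal sketch of a two-cluster K-means VAD in the time domain, together with the HR1 and HR0 measures. It is not the KVD algorithm itself: the abstract does not state which time-domain features KVD uses, so the short-time log energy, the frame length and hop, the min/max centroid initialisation, the synthetic test signal, and all function names here are assumptions chosen for illustration.

```python
import numpy as np

def frame_log_energy(signal, frame_len=400, hop=160):
    """Short-time log energy per frame (assumed feature, not taken from the paper)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] for i in range(n_frames)])
    return np.log10(np.mean(frames ** 2, axis=1) + 1e-12)

def two_means_1d(x, n_iter=50):
    """Plain 1-D K-means with K=2; label 1 is the higher-valued cluster."""
    centroids = np.array([x.min(), x.max()], dtype=float)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Assign each frame to the nearer centroid.
        labels = (np.abs(x - centroids[1]) < np.abs(x - centroids[0])).astype(int)
        updated = centroids.copy()
        for k in (0, 1):
            if np.any(labels == k):
                updated[k] = x[labels == k].mean()
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def hit_rates(pred, ref):
    """HR1: fraction of speech frames detected; HR0: fraction of non-speech frames detected."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    hr1 = float(np.mean(pred[ref]))
    hr0 = float(np.mean(~pred[~ref]))
    return hr1, hr0

# Synthetic stand-in for clean speech: low-level noise, a 200 Hz tone, noise again.
rng = np.random.default_rng(0)
fs, seg = 16000, 8000
noise = lambda n: 0.001 * rng.standard_normal(n)
tone = 0.5 * np.sin(2 * np.pi * 200 * np.arange(seg) / fs) + noise(seg)
signal = np.concatenate([noise(seg), tone, noise(seg)])

frame_len, hop = 400, 160
energy = frame_log_energy(signal, frame_len, hop)
labels, centroids = two_means_1d(energy)

# Reference labels: a frame counts as speech if its centre lies in the tone segment.
centres = np.arange(len(energy)) * hop + frame_len // 2
reference = (centres >= seg) & (centres < 2 * seg)

hr1, hr0 = hit_rates(labels, reference)
print(f"HR1 = {hr1:.3f}, HR0 = {hr0:.3f}")
```

No decision threshold is supplied anywhere in the sketch: the boundary between the two classes falls out of the clustering, which is the property the abstract highlights for unsupervised segmentation. On real recordings, frames at speech boundaries and low-energy speech sounds would make the two clusters less well separated than in this synthetic case.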
