Abstract

Speaker change information in speech arises from both the vocal tract system and the excitation source. In this work, the excitation source information is extracted by computing cepstral features from the zero frequency filtered speech (ZFFS) signal, and the vocal tract system information is extracted by computing cepstral features from the speech signal itself. The speaker change evidence obtained from the two feature sets is combined, and the two sets are observed to carry complementary information for speaker change detection. Two popular distance-metric-based algorithms, the Bayesian Information Criterion (BIC) and the Kullback-Leibler Divergence (KLD), are used to detect the speaker change evidence. The Miss Detection Rate (MDR) of the BIC-based algorithm is 24.18% using cepstral features from speech and 25.92% using cepstral features from ZFFS. When the two sets of evidence are combined, the MDR reduces to 15.89%. Similarly, the individual MDRs of the KLD-based algorithm are 32.24% from speech and 45.17% from ZFFS, whereas the combination reduces the MDR to 19.67%. Experiments are also performed on noisy speech, and a similar reduction in MDR is observed. This demonstrates the usefulness of cepstral features from the excitation source signal for reducing MDR.
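The abstract does not give the exact formulations, so the following is only a minimal sketch of how BIC-based and KLD-based distances between two adjacent windows of cepstral feature vectors are commonly computed. It assumes each window is modeled by a single full-covariance Gaussian. The function names, the `penalty_weight` parameter, and the small covariance regularizer are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def _gaussian_stats(x, eps=1e-6):
    """Mean and regularized full covariance of a (frames x dims) feature window."""
    mean = x.mean(axis=0)
    # eps * I is an assumed regularizer to keep the covariance invertible.
    cov = np.cov(x, rowvar=False) + eps * np.eye(x.shape[1])
    return mean, cov


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC between windows x and y; positive values suggest a change."""
    z = np.vstack([x, y])
    n_x, n_y, n_z = len(x), len(y), len(z)
    dim = z.shape[1]
    # Log-determinants of the covariances of the pooled and separate windows.
    logdet_z = np.linalg.slogdet(_gaussian_stats(z)[1])[1]
    logdet_x = np.linalg.slogdet(_gaussian_stats(x)[1])[1]
    logdet_y = np.linalg.slogdet(_gaussian_stats(y)[1])[1]
    gain = 0.5 * (n_z * logdet_z - n_x * logdet_x - n_y * logdet_y)
    # Penalty for the extra Gaussian: dim means plus dim*(dim+1)/2 covariances.
    n_params = dim + 0.5 * dim * (dim + 1)
    penalty = 0.5 * penalty_weight * n_params * np.log(n_z)
    return gain - penalty


def symmetric_kld(x, y):
    """Symmetric KL divergence, KL(x||y) + KL(y||x), between Gaussian fits."""
    mean_x, cov_x = _gaussian_stats(x)
    mean_y, cov_y = _gaussian_stats(y)
    inv_x, inv_y = np.linalg.inv(cov_x), np.linalg.inv(cov_y)
    diff = mean_x - mean_y
    dim = x.shape[1]
    trace_term = np.trace(inv_y @ cov_x) + np.trace(inv_x @ cov_y) - 2 * dim
    mean_term = diff @ (inv_x + inv_y) @ diff
    return 0.5 * (trace_term + mean_term)
```

In a detector of this kind, either distance would typically be evaluated at each candidate boundary while sliding a pair of adjacent windows over the feature sequence, with peaks (or positive delta-BIC values) taken as speaker change evidence. How the evidence from the speech and ZFFS feature streams is combined is not specified in the abstract and is not shown here.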
