Abstract

Accurate estimation of glottal closure instants (GCIs) enables several pitch-synchronous speech analysis tasks, such as prosody modification, glottal inverse filtering, and the study of pathological speech. We propose a probabilistic source-filter model (PSFM) for voiced speech, in which the source is modeled using a Bernoulli-Gaussian distribution that captures the GCI locations, and the all-pole filter coefficients are modeled using a Gaussian distribution. The probability of a GCI at each speech sample is estimated using Gibbs sampling. We propose a cost function to estimate the exact GCI locations using N-best dynamic programming. A key feature of the proposed PSFM is that it allows the second-order statistics of the noise to be included when estimating the GCI locations, which yields a noise-robust GCI detection technique, although at high computational complexity. Evaluation on the archivable priority list actual-word database (APLAWD) shows that the proposed algorithm performs on par with the state-of-the-art GCI detection method on clean speech. However, when evaluated in noisy conditions using five types of noise at six signal-to-noise ratio (SNR) levels, the proposed method performs better than the best of the existing GCI detection schemes, particularly at low SNR, indicating its noise robustness.
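
To make the generative side of such a source-filter model concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common reading of a Bernoulli-Gaussian source, in which each excitation sample is the product of a Bernoulli indicator (1 at a GCI, 0 elsewhere) and a Gaussian amplitude, and it passes that excitation through a fixed all-pole filter. The function names, the sign convention of the filter recursion, and all parameter values (p_gci, sigma, the two filter coefficients) are hypothetical choices for illustration. The sketch covers only forward synthesis; the Gibbs sampling and N-best dynamic programming steps are not shown.

```python
import random


def bernoulli_gaussian_excitation(n_samples, p_gci, sigma, rng):
    """Draw e[n] = b[n] * g[n], with b[n] ~ Bernoulli(p_gci) and g[n] ~ N(0, sigma^2).

    b[n] = 1 marks a (hypothetical) GCI at sample n.
    """
    indicators = [1 if rng.random() < p_gci else 0 for _ in range(n_samples)]
    excitation = [b * rng.gauss(0.0, sigma) for b in indicators]
    return indicators, excitation


def all_pole_filter(excitation, coeffs):
    """Compute s[n] = e[n] + sum_k coeffs[k-1] * s[n-k] (assumed sign convention)."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out


# Illustrative values only: a stable two-pole filter and a sparse excitation.
rng = random.Random(0)
indicators, excitation = bernoulli_gaussian_excitation(400, 0.02, 1.0, rng)
speech = all_pole_filter(excitation, [1.3, -0.6])
gci_candidates = [n for n, b in enumerate(indicators) if b == 1]
```

In this sketch, GCI detection corresponds to recovering `indicators` from `speech` alone; the abstract's approach estimates the per-sample probability of those indicators rather than observing them directly.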
