Glottal inverse filtering is a noninvasive method for estimating the glottal flow from the speech signal. In this paper, we propose a method for glottal inverse filtering based on probabilistic weighted linear prediction (PWLP), in which the speech is assumed to be the output of an all-pole filter with the glottal flow as its excitation. First, we introduce a probabilistic interpretation of weighted linear prediction (WLP), and we propose a probabilistic temporal weighting defined as the convolution of a binary vector and a fixed window. We construct the posterior distribution from the PWLP likelihood and a Gaussian prior on the filter coefficients. The parameters are estimated using Gibbs sampling. The experiments are performed using synthetic data based on the Liljencrants–Fant (LF) model, synthetic data of different vowels based on a physical model, and real speech data. Results demonstrate that the proposed method outperforms the best of the existing state-of-the-art methods in terms of the normalized amplitude quotient by 0.035 and 0.12 for the LF model and physical model based synthetic data, respectively. The results on real speech data show that the glottal flow estimated by the proposed method is flatter in the closed phase and has less formant ripple than that of existing state-of-the-art methods. We also show two key features of the proposed method. First, it does not need prior detection of glottal closure or opening instants: the temporal weights are learnt in a data-driven manner and are often found to be high near the closed phase of the glottal cycle. Second, the Gaussian prior helps in estimating the filter coefficients when the closed phase duration is small.
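To make the ingredients of the abstract concrete, the following is a minimal sketch and not the proposed method itself. It builds temporal weights as the convolution of a binary vector with a fixed window, then computes a point (MAP) estimate of all-pole coefficients under a weighted Gaussian likelihood and a zero-mean Gaussian prior, which reduces to weighted ridge regression. Everything here is an assumption made for illustration: the function names, the Hann window, the hand-placed binary vector, the prior variance, and the synthetic second-order signal. In particular, the binary vector is fixed by hand, whereas the paper infers it jointly with the coefficients by Gibbs sampling.

```python
import numpy as np

def temporal_weights(binary, window, floor=1e-3):
    """Temporal weights as the convolution of a binary vector with a fixed window.

    The small floor (an assumption of this sketch) keeps every weight positive.
    """
    w = np.convolve(binary, window, mode="same")
    return np.maximum(w, floor)

def weighted_lp_map(s, order, weights, prior_var=10.0):
    """MAP all-pole coefficients for s[n] ~ sum_k a[k] * s[n-k].

    Weighted Gaussian likelihood plus a zero-mean Gaussian prior on the
    coefficients gives a weighted ridge-regression solution.
    """
    # Row for sample n holds the previous samples s[n-1], ..., s[n-order].
    X = np.column_stack([s[order - k:len(s) - k] for k in range(1, order + 1)])
    y = s[order:]
    w = weights[order:]
    A = X.T @ (w[:, None] * X) + np.eye(order) / prior_var
    b = X.T @ (w * y)
    return np.linalg.solve(A, b)

# Hypothetical test signal: a stable second-order all-pole filter driven by noise.
rng = np.random.default_rng(0)
true_a = np.array([1.3, -0.8])
N = 2000
e = rng.standard_normal(N)
s = np.zeros(N)
for n in range(2, N):
    s[n] = true_a[0] * s[n - 1] + true_a[1] * s[n - 2] + e[n]

# Hand-placed binary vector (one marker every 100 samples) and a Hann window.
binary = np.zeros(N)
binary[::100] = 1.0
weights = temporal_weights(binary, np.hanning(41))

a_hat = weighted_lp_map(s, order=2, weights=weights)
print(a_hat)
```

On real speech the excitation is not white, so where the weights are large matters: weights concentrated near the closed phase emphasize samples where the all-pole model fits best, and the prior term keeps the linear system well conditioned when few samples carry high weight.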