Abstract

In the sparse imputation approach, missing spectral components of speech are estimated with compressive sensing techniques. For this purpose, a dictionary of clean speech exemplars must be prepared; noisy feature vectors are then reconstructed from the dictionary entries. In this approach, the dictionary must adequately cover all possible varieties of speech feature vectors, so for any given speech frame it contains many irrelevant entries. These entries inflate the dictionary, which in turn slows the estimation process and may introduce artifacts into the final estimate. To address this problem, the current work proposes clustering the dictionary entries into smaller subspaces; the subspace relevant to a given feature vector can then be found through a posterior criterion. Moreover, it is shown that the likelihood of the Gaussian models fitted to the subspaces can serve as a regularization term, acting as additional prior knowledge in the estimation process and significantly increasing the final performance. To evaluate the benefits of the proposed methods, ASR experiments are conducted on two different speech corpora: an English noisy connected-digit database (Aurora 2) and a Persian continuous-speech corpus (FARSDAT). The experiments show that the proposed methods not only increase the absolute word recognition accuracy but also make the entire process several times faster than the original sparse imputation approach.
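The cluster-selection idea can be sketched as follows. This is a minimal illustration, not the paper's method: the toy data, the diagonal-Gaussian cluster models, and the k-nearest-exemplar reconstruction (a stand-in for the paper's compressive-sensing solver) are all assumptions made for the sake of a runnable example. The key steps it mirrors are (1) scoring each cluster's Gaussian on the reliable components only, and (2) reconstructing the missing components from exemplars of the winning cluster.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "clean speech" dictionary: two clusters of exemplars in R^8.
# (Illustrative data only -- real systems would use spectral features.)
d = 8
c0 = rng.normal(0.0, 0.3, (50, d)) + 2.0   # cluster 0, centered near +2
c1 = rng.normal(0.0, 0.3, (50, d)) - 2.0   # cluster 1, centered near -2
clusters = [c0, c1]

# Fit a diagonal Gaussian (mean, variance) to each cluster.
models = [(c.mean(axis=0), c.var(axis=0) + 1e-6) for c in clusters]

def log_gauss(x, mean, var, mask):
    """Diagonal-Gaussian log-likelihood over the reliable components only."""
    m, v = mean[mask], var[mask]
    return -0.5 * np.sum((x[mask] - m) ** 2 / v + np.log(2 * np.pi * v))

def sparse_impute(y, mask, k=5):
    """Pick the best cluster by its Gaussian score on the reliable
    components, then estimate the frame from the k nearest exemplars
    of that cluster (a simple proxy for a sparse reconstruction)."""
    scores = [log_gauss(y, m, v, mask) for m, v in models]
    best = clusters[int(np.argmax(scores))]
    dists = np.linalg.norm(best[:, mask] - y[mask], axis=1)
    return best[np.argsort(dists)[:k]].mean(axis=0)

# Corrupt a clean frame from cluster 1: components 0..3 are missing.
clean = c1[0]
mask = np.zeros(d, dtype=bool)
mask[4:] = True            # only components 4..7 are reliable
noisy = clean.copy()
noisy[~mask] = 0.0         # missing components destroyed by "noise"

est = sparse_impute(noisy, mask)
err_before = np.abs(clean[~mask]).mean()          # error of the zeroed frame
err_after = np.abs(clean[~mask] - est[~mask]).mean()
print(err_after < err_before)                      # imputation beats baseline
```

Restricting the Gaussian score to the reliable components is what lets the cluster be chosen before the missing components are known; the full cluster exemplars then supply the missing dimensions.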
