Abstract

This paper uses wavelet and multiresolution analysis to decompose a signal into a set of frequency channels of equal bandwidth on a logarithmic scale, i.e., to analyze the signal with constant-Q filters, and derives cepstral features for each of the separated spatial frequency bands. Unlike filter-bank analysis, wavelet analysis decomposes a signal into orthogonal spatial frequency bands, so the overlap between two neighboring frequency bands is very small. Because of this property, a separate weight can be assigned to each frequency channel to increase the discriminability between two signals, which in turn can improve the recognition rate. We model each channel with a Bayesian network and propose an algorithm for determining the channel weights. The experimental results show that a 3-channel decomposition achieves a better recognition rate on speech signals than 1-channel recognition. The average recognition rate also exceeds that of the filter-bank method and the MFCC method by 3.54% and 1.95%, respectively.
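
The octave-band decomposition and per-channel weighting described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the Haar wavelet, the function names (`haar_step`, `octave_channels`, `combine_scores`), and the simple normalized weighted sum are all assumptions chosen for brevity, and the abstract specifies neither the wavelet used nor the weight-estimation algorithm. Two decomposition levels give the three channels mentioned in the experiments.

```python
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar transform: returns (approx, detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass half-band
    detail = (even - odd) / np.sqrt(2.0)   # high-pass half-band
    return approx, detail

def octave_channels(x, levels=2):
    """Split x into levels + 1 octave-band channels, highest band first.

    Each level halves the remaining low band, so the channels have
    equal bandwidth on a logarithmic frequency scale (constant Q).
    """
    channels = []
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        channels.append(detail)
    channels.append(approx)             # final low-pass residue
    return channels

def combine_scores(channel_scores, weights):
    """Hypothetical fusion rule: weighted sum of per-channel scores
    (e.g. log-likelihoods), with weights normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, np.asarray(channel_scores, dtype=float)))

# Example: a 64-sample sinusoid split into 3 channels.
signal = np.sin(2 * np.pi * np.arange(64) / 16)
bands = octave_channels(signal, levels=2)
band_lengths = [len(b) for b in bands]   # 32, 16, 16 coefficients
```

Because the Haar transform used here is orthonormal, the total energy of the coefficients equals the energy of the input, which is one way to see that the channels do not duplicate information. In a recognizer, each channel would feed its own cepstral front end and model, and `combine_scores` stands in for whatever weighting scheme is actually used.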
