Several speaker verification techniques based on different features of the human voice, with verification accuracy better than 99%, have been proposed in the literature. Most of these systems work well only when the reference and test speech utterances are recorded under the same relatively noise-free conditions. The goal of this study was to develop a speaker verification system based on an orthogonal linear prediction model that adopts a unified approach, utilizing only one set of reference parameters irrespective of whether the test speech data is of high quality or noise-corrupted. Initial investigations reveal that a significant improvement in accuracy is achieved if the verification algorithm utilizes only those parameters that are least sensitive to additive broadband noise. However, for reference and test utterances spoken over the telephone, satisfactory verification is realized only when all the orthogonal parameters are utilized in the distance computation. Finally, it is shown that a significant improvement in accuracy is realizable if the test utterances (whether corrupted by additive noise or by passage through noisy telephone channels) are preprocessed by a modified adaptive noise cancellation filter.