Abstract
Like any measurement, the source data for speaker recognition is subject to measurement errors caused by transducer, channel, and quantization effects. When the training and test equipment are known, one may calibrate the hardware and introduce a noise compensation procedure into the recognition process. In many applications, however, it is highly desirable for speaker recognition to function with a wide variety of unknown equipment, which makes calibration impractical. A study is presented with the a priori assumption that, for each feature vector $\vec{o}$ observed with measurement noise, an error-compensated vector $\hat{\vec{o}}$ is uniformly distributed within some interval ±ε of the observed vector. A statistic derived from the observation set is computed and used to estimate an empirical interval in the neighborhood of each observation. An approximation to the integral of the density over this interval is computed and used in place of the density evaluated at $\vec{o}$. Tests using a 15-component cepstral feature vector derived from telephone-quality speech (King corpus, San Diego speakers, sessions 1–5) have shown error-rate reductions on the order of 15% compared with a baseline system. Techniques to reduce the algorithmic cost of the integration will also be discussed.
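As an illustration of the interval idea only, the following minimal sketch replaces a point density with its average over a box of half-width ε around the observation. It assumes a single Gaussian with diagonal covariance, for which the integral factorizes per dimension and has a closed form through the normal CDF. The abstract does not specify the density model, the integration approximation, or the statistic used to choose ε, so the function names and the choice of ε here are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a univariate normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def point_density(obs, means, stds):
    """Baseline: diagonal-covariance Gaussian density evaluated at obs."""
    p = 1.0
    for o, m, s in zip(obs, means, stds):
        z = (o - m) / s
        p *= math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
    return p


def interval_averaged_density(obs, means, stds, eps):
    """Average of the same density over the box [obs - eps, obs + eps].

    Assumption: the error-compensated vector is uniform over the box, so
    each dimension contributes (probability mass in interval) / (2 * eps).
    The eps values are supplied by the caller; how they would be estimated
    from the observation set is not covered by this sketch.
    """
    p = 1.0
    for o, m, s, e in zip(obs, means, stds, eps):
        mass = normal_cdf(o + e, m, s) - normal_cdf(o - e, m, s)
        p *= mass / (2.0 * e)
    return p
```

Under these assumptions, the averaged value approaches the point density as ε shrinks, is lower than the point density near the mode, and is higher in the tails, so an observation displaced by measurement error is penalized less severely than under the baseline.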