Abstract

Two experiments were conducted to identify acoustic features used for speaker identity verification (SIV) by human listeners but not by cepstral-based algorithms. Although these algorithms generally outperform human listeners on randomly selected comparisons between single-word utterances, the approach here was to analyze human performance on comparisons that the machine could not discriminate effectively. Experiment 1 showed that humans could perform these comparisons at high levels of accuracy, suggesting either that information exists that the algorithms do not capture, or that the algorithms encode the information but do not use it effectively. Experiment 2 presented three stimulus conditions for SIV: digitized speech signals, noise-excited LPC-resynthesized signals, and LPC prediction-error signals. Performance was high in the natural-speech and prediction-error conditions and near chance in the noise-excited condition, suggesting that the error signal carries information that allows humans to distinguish between speakers. It may be possible to improve verification algorithms by adapting current models to exploit more fully the information used by human listeners.
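As context for the third stimulus condition, the LPC prediction-error (residual) signal is what remains after a linear predictor is subtracted from the speech waveform. The sketch below is illustrative only; the paper does not specify the analysis order, windowing, or method, so this assumes the standard autocorrelation method with the Levinson-Durbin recursion, implemented with NumPy:

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Estimate LPC coefficients [1, a1, ..., ap] with the
    autocorrelation method (Levinson-Durbin recursion)."""
    n = len(signal)
    # Autocorrelation lags 0..order.
    r = np.array([np.dot(signal[:n - k], signal[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this stage of the recursion.
        acc = r[i] + np.dot(a[1:i], r[1:i][::-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a

def lpc_residual(signal, a):
    """Prediction-error signal: the speech filtered by the inverse
    (analysis) filter A(z), i.e. e[n] = x[n] + sum_j a[j] * x[n-j]."""
    return np.convolve(signal, a)[:len(signal)]
```

Feeding the residual back through the all-pole synthesis filter reconstructs the original signal, while replacing it with white noise (the second stimulus condition) preserves only the spectral envelope.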
