Abstract

A new approach to machine speech recognition is described that incorporates nonlinear spectral interferometry to model the binaural advantage in human speech recognition. This ESR scheme uses interharmonic and interear visibility observables from the spectra of phonemes to provide phoneme identification signatures. In particular, the phase spectra of phonemes are found to be coupled to, but not redundant with, the amplitude spectra. Using both amplitude and phase spectra allows the resolution of phoneme ambiguities that are often present when signatures are derived from amplitude spectra alone. Interharmonic phase modulation appears to be a secondary means of encoding human speech and is used primarily to discern the speaker's identity and mood.
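The claim that phase spectra are not redundant with amplitude spectra can be illustrated with a small sketch. It is not the scheme described above: the function names, the three-harmonic test signals, and the "relative phase between adjacent harmonics" quantity are assumptions made here for illustration, and the visibility observables of the abstract are not reproduced. Two synthetic signals are built with identical harmonic amplitudes but different harmonic phases; their amplitude spectra match, while their interharmonic phase differences do not.

```python
import cmath
import math

def harmonic_signal(phases, n=256, f0=8):
    """Sum of unit-amplitude cosines at harmonics f0, 2*f0, ... with the given phases."""
    return [
        sum(math.cos(2 * math.pi * (k + 1) * f0 * t / n + p)
            for k, p in enumerate(phases))
        for t in range(n)
    ]

def dft_bin(x, k):
    """Single bin of the discrete Fourier transform of x."""
    n = len(x)
    return sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

def harmonic_spectrum(x, f0, n_harm):
    """Amplitude and phase of the first n_harm harmonics of f0."""
    bins = [dft_bin(x, (k + 1) * f0) for k in range(n_harm)]
    amps = [2 * abs(b) / len(x) for b in bins]
    phases = [cmath.phase(b) for b in bins]
    return amps, phases

def relative_phases(phases):
    """Phase difference between adjacent harmonics, wrapped to (-pi, pi].

    Hypothetical stand-in for an interharmonic phase quantity; an assumption
    of this sketch, not a definition taken from the abstract.
    """
    diffs = [b - a for a, b in zip(phases, phases[1:])]
    return [math.atan2(math.sin(d), math.cos(d)) for d in diffs]

F0 = 8  # fundamental, in DFT bins (assumed value)
sig_a = harmonic_signal([0.0, 0.0, 0.0], f0=F0)
sig_b = harmonic_signal([0.0, math.pi / 2, -math.pi / 3], f0=F0)

amps_a, phases_a = harmonic_spectrum(sig_a, F0, 3)
amps_b, phases_b = harmonic_spectrum(sig_b, F0, 3)
rel_a = relative_phases(phases_a)
rel_b = relative_phases(phases_b)

same_amplitudes = all(abs(a - b) < 1e-9 for a, b in zip(amps_a, amps_b))
same_phases = all(abs(a - b) < 1e-9 for a, b in zip(rel_a, rel_b))
print("amplitude spectra match:", same_amplitudes)
print("interharmonic phases match:", same_phases)
```

An amplitude-only comparison cannot separate the two signals, whereas the phase differences can, which is the sense in which phase carries additional information.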
