Abstract

The wave collation display is a pitch-synchronous, time-domain visual speech display. Collation processing maps the speech waveform into a planar array, condensing the waveform and making speech information, including pitch contour and formant transitions, salient. The display was evaluated in two ways: analytically and through training. The analytic evaluation used a perceptual sorting task with untrained subjects, who sorted printed speech display tokens by visual similarity in a match-to-exemplar design. Stimuli included voiceless consonants and vowels, the latter from a single speaker, from multiple speakers, and in multiple phonemic contexts. Untrained subjects scored 73% correct (consonants) and 71% correct (vowels) on single-speaker tokens, falling to 46% correct on multiple-speaker vowels. For comparison, the same analytic evaluation was performed with spectrograms for single-speaker and multiple-speaker vowels. Overall results were statistically equivalent to those for the collation display, with 76% correct (single-speaker vowels) and 44% correct (multiple-speaker vowels). In the training component, four subjects were trained on the collation display sorting tasks described above; after they mastered these tasks, generalization to novel stimuli was tested. The tasks were mastered in a few hours. Generalization to novel tokens from a familiar speaker was nearly perfect, whereas generalization to unfamiliar speakers was imperfect.
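The abstract does not spell out the collation algorithm, so the following is only a rough illustration of what a pitch-synchronous mapping of a waveform into a planar array could look like. It assumes the pitch period boundaries are already known and stacks one period per row; the function name `collate`, the fixed row width, the zero padding, and the synthetic test signal are all assumptions made for this sketch, not details taken from the paper.

```python
import math

def collate(signal, period_starts, width):
    """Stack consecutive pitch periods as rows of a planar array.

    signal        -- list of float samples
    period_starts -- sample indices marking period boundaries (assumed known);
                     the last entry marks the end of the final period
    width         -- fixed row length; shorter periods are zero-padded,
                     longer ones are truncated (an assumption of this sketch)
    """
    rows = []
    for start, end in zip(period_starts, period_starts[1:]):
        period = signal[start:end][:width]
        rows.append(period + [0.0] * (width - len(period)))
    return rows

def synth_period(length, fs=8000.0, resonance_hz=500.0):
    """One hypothetical 'voiced' period: a damped sinusoid restarted each period."""
    return [
        math.exp(-n / 30.0) * math.sin(2.0 * math.pi * resonance_hz * n / fs)
        for n in range(length)
    ]

# Synthetic signal whose period lengthens over time (a falling pitch).
lengths = [80, 82, 84, 86]
signal = []
period_starts = [0]
for length in lengths:
    signal.extend(synth_period(length))
    period_starts.append(len(signal))

rows = collate(signal, period_starts, width=90)
```

In an array built this way, each row holds one pitch period, so the changing amount of padding at the row ends traces the pitch contour, while shifts in the oscillation pattern from row to row would correspond to changes in resonance. Whether the actual display pads, truncates, or rescales periods is not stated in the abstract.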
