Self-organizing maps for measuring similarity of audio-visual speech percepts

Hans‐Heinrich Bothe

doi:10.1121/1.4788569

Abstract

The goal of this work is to find a way to measure the similarity of audio-visual speech percepts. Phoneme-related self-organizing maps (SOMs) with a rectangular basis are trained on data from a labeled video film. The training uses a combination of auditory speech features and the corresponding visual lip features. Phoneme-related receptive fields emerge on the SOM basis; they are speaker dependent and show individual locations and strain. Overlapping main slopes indicate a high similarity of the respective units; distortion or extra peaks originate from the influence of other units. Depending on the training data, these other units may also be contextually immediate neighboring units. The poster demonstrates the idea with text material spoken by a single subject, using a set of simple audio-visual features. The training data consist of 44 labeled German sentences with a balanced phoneme repertoire. The results show that (i) the SOMs can be trained to map auditory and visual features in a topology-preserving way and (ii) the resulting maps show strain due to the influence of other audio-visual units. The SOMs can be used to measure similarity among audio-visual speech percepts and to measure coarticulatory effects.
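
The idea of comparing receptive fields on a rectangular SOM can be illustrated with a minimal sketch. It is not the author's implementation: synthetic Gaussian clusters stand in for the audio-visual feature vectors, the names `unit_a`, `unit_b`, and `unit_c` are hypothetical, the grid size and learning schedule are arbitrary, and histogram intersection is only one assumed way to quantify how much two receptive fields overlap.

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=10, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rectangular SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    # Grid coordinates of every map unit, used by the neighborhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5    # shrinking neighborhood
            # Best-matching unit: the map unit closest to the sample.
            dist = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Gaussian neighborhood centered on the best-matching unit.
            d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            g = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

def receptive_field(weights, samples):
    """Normalized histogram of best-matching units for one labeled class."""
    rows, cols, _ = weights.shape
    field = np.zeros((rows, cols))
    for x in samples:
        dist = np.linalg.norm(weights - x, axis=-1)
        field[np.unravel_index(np.argmin(dist), dist.shape)] += 1
    return field / field.sum()

def field_overlap(a, b):
    """Histogram intersection in [0, 1]; 1 means identical fields."""
    return float(np.minimum(a, b).sum())

# Synthetic stand-ins for labeled audio-visual feature vectors (assumption):
# 5 dimensions, e.g. 3 auditory plus 2 lip features per frame.
rng = np.random.default_rng(1)
unit_a = rng.normal(loc=0.0, scale=1.0, size=(200, 5))
unit_b = rng.normal(loc=0.5, scale=1.0, size=(200, 5))  # close to unit_a
unit_c = rng.normal(loc=8.0, scale=1.0, size=(200, 5))  # far from unit_a

weights = train_som(np.vstack([unit_a, unit_b, unit_c]))
field_a = receptive_field(weights, unit_a)
field_b = receptive_field(weights, unit_b)
field_c = receptive_field(weights, unit_c)

sim_ab = field_overlap(field_a, field_b)
sim_ac = field_overlap(field_a, field_c)
print(f"overlap a/b: {sim_ab:.2f}, overlap a/c: {sim_ac:.2f}")
```

In this toy setting, classes with similar features occupy overlapping regions of the map, so their receptive fields intersect more than those of dissimilar classes. Real data would require actual auditory and lip features extracted from labeled recordings.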
