Prominence Detection Using Auditory Attention Cues and Task-Dependent High Level Information

Ozlem Kalinli, Shrikanth Narayanan

doi:10.1109/tasl.2009.2014795

Abstract

Auditory attention is a complex mechanism that involves the processing of low-level acoustic cues together with higher-level cognitive cues. In this paper, a novel method is proposed that combines biologically inspired auditory attention cues with higher-level lexical and syntactic information to model task-dependent influences on a given spoken language processing task. A set of low-level multiscale features (intensity, frequency contrast, temporal contrast, orientation, and pitch) is extracted in parallel from the auditory spectrum of the sound, following the processing stages of the central auditory system. The resulting feature maps are converted to auditory gist features that capture the essence of a sound scene. The auditory attention model biases the gist features in a task-dependent way to maximize target detection in a given scene. Furthermore, the top-down task-dependent influence of lexical and syntactic information is incorporated into the model using a probabilistic approach. The lexical information is incorporated by using a probabilistic language model, and the syntactic knowledge is modeled using part-of-speech (POS) tags. The combined model is tested on automatically detecting prominent syllables in speech using the BU Radio News Corpus. The model achieves 88.33% prominence detection accuracy at the syllable level and 85.71% accuracy at the word level. These results compare well with reported human performance on this task.
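
The abstract states only that acoustic, lexical, and syntactic information are combined "using a probabilistic approach" and does not give the formula. As a rough illustration of what such a combination might look like, the sketch below fuses per-source posteriors for a single syllable under a conditional-independence (naive Bayes) assumption. The function name `combine_posteriors`, the independence assumption, and the example probabilities are all hypothetical and are not taken from the paper.

```python
import math


def combine_posteriors(posteriors, prior):
    """Fuse P(prominent | source) values from several information sources.

    Assumption (not from the paper): the sources are conditionally
    independent given the class, so
        P(c | s1..sn) is proportional to prod_i P(c | s_i) / P(c)^(n-1).
    `posteriors` holds one probability per source (e.g. acoustic, lexical,
    syntactic); `prior` is the class prior P(prominent).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if not posteriors:
        raise ValueError("at least one posterior is required")

    eps = 1e-12  # keeps log() finite when a source outputs exactly 0 or 1
    n = len(posteriors)

    # Work in the log domain for numerical stability.
    log_pos = -(n - 1) * math.log(prior)
    log_neg = -(n - 1) * math.log(1.0 - prior)
    for p in posteriors:
        p = min(max(p, eps), 1.0 - eps)
        log_pos += math.log(p)
        log_neg += math.log(1.0 - p)

    # Normalize the two class scores into a probability.
    shift = max(log_pos, log_neg)
    pos = math.exp(log_pos - shift)
    neg = math.exp(log_neg - shift)
    return pos / (pos + neg)


# Hypothetical per-source posteriors for one syllable:
# acoustic (auditory gist) model, language model, POS-tag model.
fused = combine_posteriors([0.9, 0.8, 0.7], prior=0.5)
is_prominent = fused > 0.5
```

With a single source the function returns that source's posterior unchanged, and sources that agree reinforce one another. A real system would estimate each posterior from trained models and may weight the sources differently.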

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Jul 1, 2009
Citations: 131
