An MRI based study of the acoustic effects of sinus cavities and its application to speaker recognition

Tarun Pruthi, Carol Y Espy-Wilson

doi:10.21437/interspeech.2006-193

Abstract

The goal of this paper is to explore how changes in velar coupling area and oral cavity configuration affect the poles and zeros that the sphenoidal and maxillary sinuses introduce in the spectra of nasalized vowels and nasal consonants. MRI data for the vocal tract and nasal tract of one speaker were used to simulate the spectra of nasalized vowels and nasal consonants with different coupling areas. It is shown that during nasalized vowels, the frequencies of both the poles and the zeros due to the sinuses change with the velar coupling area and with the vowel. It is also shown that during nasal consonants, the zero frequencies are constant and the pole frequencies are more stable than in nasalized vowels. This study therefore corroborates the use of nasal consonant spectra for speaker recognition and raises doubts about the potential benefits of using nasalization during vowels for that purpose.

Index Terms: speaker recognition, nasal, sinus, MRI.

The nasal cavity is probably the most complicated structure involved in the production of speech. Unlike the oral cavity, the nasal cavity is divided into two parallel passages that end at the two nostrils. The nasal cavity also has several paranasal cavities called sinuses. Humans have four kinds of sinuses: Maxillary Sinus (MS), Frontal Sinus (FS), Sphenoidal Sinus (SS) and Ethmoidal Sinus (ES). These sinuses are connected to the main nasal passages through small openings called ostia. Coupling between the nasal tract and the vocal tract (oral cavity and pharyngeal cavity) is controlled by a movable fold called the velum. It has been shown that the asymmetry between the two nasal passages can introduce extra poles and zeros in the acoustic spectrum [1].
It has also been shown that the maxillary sinuses account for the lowest pole-zero pair seen in the acoustic spectrum (especially for low vowels) when nasalization is introduced [2, 3], and that they are very important in making speech sound nasal [4]. Despite several studies, the exact dynamics of the poles and zeros due to the sinuses remain unclear. In this study, MRI data for the vocal tract and nasal tract of one speaker, recorded by Story et al. [5, 6], were used to simulate the spectral effects of SS and MS (the only two sinuses for which data were recorded). The study focuses on how the poles and zeros due to the sinuses move with a change in the velar coupling area and the oral cavity configuration. Four vowels and two nasal consonants were considered. Analysis of the MRI data shows that during nasalized vowels, the frequencies of both the poles and the zeros due to the sinuses change with the velar coupling area and with the vowel. During nasal consonants, however, the frequencies of the zeros due to the sinuses stay fixed.

Several researchers have shown the effectiveness of nasal consonant regions for speaker recognition. The power spectrum during nasal consonants was used for speaker recognition in [7], and features extracted from nasal consonant spectra were used for the same purpose in [8]. In another paper [9], coarticulation between the nasal and the following vowel was used as a cue for speaker recognition; the authors showed that their coarticulation measure worked better than the nasal spectrum alone. Other studies on the relative speaker-discriminating properties of phonemes [10, 11, 12, 13] have shown that nasals and vowels perform best.
Although several researchers have shown that nasal consonant regions give reliable cues for speaker recognition, no one has used nasality during the vowel regions as a cue. In light of the analysis in this paper, a question arises: Does nasalization during vowels provide a good cue for speaker recognition?
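
The observation that the sinus zeros stay fixed during nasal consonants can be illustrated with a lumped-element sketch. It is not the simulation used in the paper: it assumes a sinus behaves roughly like a Helmholtz resonator (a cavity of volume V joined to the nasal passage through an ostium of area A and length l), and the dimensions, air constants, and function names below are hypothetical values chosen only for illustration. In this lossless model the side-branch impedance vanishes at the resonance frequency, which places a zero in the transfer function to the nostrils at a frequency that depends only on the sinus geometry, not on the velar coupling area or the oral cavity shape.

```python
import math

# Approximate constants for warm, humid air (assumed values, not from the paper).
SPEED_OF_SOUND = 350.0  # m/s
AIR_DENSITY = 1.14      # kg/m^3


def helmholtz_frequency(volume_m3, neck_area_m2, neck_length_m, c=SPEED_OF_SOUND):
    """Resonance frequency (Hz) of an ideal Helmholtz resonator."""
    return c / (2.0 * math.pi) * math.sqrt(neck_area_m2 / (volume_m3 * neck_length_m))


def branch_impedance(freq_hz, volume_m3, neck_area_m2, neck_length_m,
                     rho=AIR_DENSITY, c=SPEED_OF_SOUND):
    """Lossless acoustic impedance of the side branch seen from the main passage."""
    omega = 2.0 * math.pi * freq_hz
    acoustic_mass = rho * neck_length_m / neck_area_m2   # ostium (neck)
    acoustic_compliance = volume_m3 / (rho * c ** 2)     # sinus cavity
    return complex(0.0, omega * acoustic_mass - 1.0 / (omega * acoustic_compliance))


# Hypothetical sinus: 15 cm^3 cavity, 0.1 cm^2 ostium area, 0.5 cm ostium length.
VOLUME = 15e-6
NECK_AREA = 1e-5
NECK_LENGTH = 5e-3

f0 = helmholtz_frequency(VOLUME, NECK_AREA, NECK_LENGTH)
z_at_f0 = branch_impedance(f0, VOLUME, NECK_AREA, NECK_LENGTH)

print(f"side-branch resonance: {f0:.1f} Hz")
print(f"|Z| at resonance: {abs(z_at_f0):.3e}")
```

During a nasalized vowel the radiated sound is the sum of the mouth and nostril outputs, so the zeros of the combined spectrum need not coincide with the side-branch resonance. That is consistent with the finding above that they shift with coupling area and vowel, although this sketch does not model that case.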
